By Deb Ray, Chief Data Officer at VideoAmp
Advertising done right provides the economic fuel that drives content creators to produce high-quality, innovative content that resonates with viewers. Of all industries, advertising has been one of the most affected by Big Data, AI and Machine Learning. John Wanamaker, a renowned merchant in the early 1900s, quipped, “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” If he lived in today’s world, he would see that every ad impression is tracked to measure how the consumer views and engages with it, and how it leads to conversions.
Advertising technology is extremely competitive, and the industry is quick to adopt the latest innovations in Machine Learning and AI to extract more ROI from every ad dollar spent. Some of the largest and fastest-growing companies, such as Google, Facebook and Snap, derive their revenues from advertising. Every bit of improvement eked out can be worth millions of dollars.
The same competitive forces have driven traditional media, like linear TV, billboards and radio, to adopt new, precise measurement techniques, to overhaul the transactional marketplace, and to adopt programmatic bidding, where each impression is bid on and won in milliseconds. For example, linear TV measurement, in which a panel of 50,000 is used to estimate what the nation is consuming, is being replaced by smart TVs (like Vizio) or set-top box data collection solutions (like Comscore), bringing massive efficiencies to the measurement and buying of media.
At VideoAmp we recognized over two years ago that all media channels, including digital video, audio, linear TV and billboards, are moving towards programmatic buying. That’s why we created a platform for the largest brands and agencies to transact across all media channels seamlessly. Our thesis is that what matters is that advertisers reach the right audience, at the right time, at the right price, regardless of the channel.
At the heart of efficient cross-channel advertising lies very high-volume, low-latency, high-frequency data on impressions. For video alone, the volume is 300,000 queries per second (QPS), and more if display is included. For each impression, a decision has to be made on whether to bid (based on the user ID, device, location, and 50 other variables), and if so how much, all within 50 milliseconds. Furthermore, based on the user ID, we limit the number of impressions served or choose the appropriate creative to serve.
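As a rough illustration of the per-impression decision described above, here is a minimal Python sketch of a bidder that applies a per-user impression limit before deciding whether to bid. It is not VideoAmp's implementation: the class names, the `floor_price` field, and the flat `max_bid` pricing are all assumptions made for the example, and a production system would keep the counters in a low-latency shared store rather than in process memory.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class BidRequest:
    """One impression opportunity (hypothetical, heavily simplified)."""
    user_id: str
    device: str
    floor_price: float  # assumed field: lowest price the seller will accept


class FrequencyCappedBidder:
    """Decides whether to bid, limiting impressions served per user ID."""

    def __init__(self, max_impressions_per_user: int, max_bid: float):
        self.max_impressions_per_user = max_impressions_per_user
        self.max_bid = max_bid
        # Impressions already served, keyed by user ID.
        self.served = defaultdict(int)

    def decide(self, request: BidRequest) -> Optional[float]:
        """Return a bid price, or None to skip this impression."""
        # Skip users who have already reached the impression limit.
        if self.served[request.user_id] >= self.max_impressions_per_user:
            return None
        # Skip inventory priced above what we are willing to pay.
        if request.floor_price > self.max_bid:
            return None
        # Assumption: bid a flat price; a real bidder would price per request.
        return self.max_bid

    def record_win(self, request: BidRequest) -> None:
        """Count an impression once the auction is won and the ad is served."""
        self.served[request.user_id] += 1
```

A caller would invoke `decide` for every incoming request and `record_win` only for auctions that are actually won, so that the limit counts served impressions rather than bids.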
Cross-channel advertising strategy typically requires us to limit the number of impressions served to a user on a single device and, in many cases, to sequence them across channels. Good brand advertising is short-form storytelling: one creative is shown on TV, the follow-on message is then delivered on radio, and finally the user is offered a coupon on their mobile phone as they near the store location.
The complexity of cross-channel targeting and measurement is solved by building a large-scale graph of consumers and their connected devices. This is where recent advances in Machine Learning come into play. Constructing the graph requires grouping billions of cookie IDs and device IDs using deterministic signals (e.g. IP address) and non-deterministic signals based on behavior. Since each channel has a different schema, frequency and type of data, feature engineering is difficult. This is where we have found a lot of success with Deep Learning methods, which learn the underlying data representation.
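To make the grouping step concrete, here is a minimal Python sketch that clusters IDs sharing a deterministic signal, using a union-find (disjoint-set) structure. This is an assumed, simplified stand-in rather than the method described in this article: it covers only exact matches on a shared signal, ignores the behavioral and Deep Learning side entirely, and the names `DeviceGraph` and `group_ids` are hypothetical.

```python
from collections import defaultdict


class DeviceGraph:
    """Union-find over cookie and device IDs."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the representative ID of the cluster containing x."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Merge the clusters containing a and b."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def group_ids(observations):
    """Cluster IDs that were observed with the same signal.

    observations: iterable of (id, signal) pairs, e.g. ("cookie1", "ip1").
    Returns a list of sets, one set of IDs per cluster.
    """
    graph = DeviceGraph()
    first_id_for_signal = {}
    for id_, signal in observations:
        graph.find(id_)  # register the ID even if it never links to another
        if signal in first_id_for_signal:
            graph.union(first_id_for_signal[signal], id_)
        else:
            first_id_for_signal[signal] = id_
    clusters = defaultdict(set)
    for id_ in list(graph.parent):
        clusters[graph.find(id_)].add(id_)
    return list(clusters.values())
```

For example, passing the pairs `("cookie1", "ip1")`, `("deviceA", "ip1")` and `("deviceC", "ip3")` puts `cookie1` and `deviceA` in one cluster and leaves `deviceC` on its own. At the scale of billions of IDs, the same idea would need a distributed connected-components computation rather than a single in-memory dictionary.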
Once the graph is constructed, we can run a variety of complex queries that are solved via graph inference algorithms. For example, we may want to know, “What are some emerging audiences and demographics for a newly released product?” or “Which sites should I advertise on to drive the most viewers to a pilot TV show?” Graph construction and rapid inference at this scale have been made possible only by the recent development of Apache Spark, a distributed in-memory computational framework.
These are exciting times for AI in the field of advertising technology. The very large scale, the complexity and the low-latency requirements pose some of the hardest engineering and data science challenges. And the competitive nature of the industry means that decisions on technology are driven by what gives the highest performance gain.
View the published article on InsideBigData.com