The Big Rewards of Big Data
A problem well defined is a problem half-solved. —John Dewey
Up to this point, we’ve defined Big Data and
its elements. We then described many of the technologies that
organizations are using to harness its value. Now it’s time to see some
of these technologies in action. This chapter examines three
organizations in depth, exploring how they have successfully deployed
Big Data tools and seen amazing results. Let’s start with a company that
makes handling Big Data its raison d’être.
How do advertisers reach their target
audiences online? It’s a simple question with anything but a simple
answer. Traditionally, advertisers reached audiences via television
based on demographic targeting. As discussed in Chapter 2, thanks to the
web, consumers today spend less time watching TV broadcasts and more
time in their own personalized media environments (i.e., their own
individually picked blogs, news stories, songs, and videos). While good
for consumers, this media fragmentation has scattered advertisers’
audiences. Relative to even twenty years ago, it is harder for them to
reach large numbers of relevant consumers.
But just as the web lets consumers choose media more
selectively, it lets advertisers choose their audiences more
selectively. That is, advertisers need not try to re-create the
effectiveness of TV advertising; they can surpass it. For example, an
hour-long prime-time show on network TV contains nearly 22 minutes of
marketing content.[1]
If advertisers could precisely target consumers, they could achieve the
same economics with just a few minutes of commercials. As a result, TV
shows could be nearly commercial free. Ads on the web are individually
delivered, so decisions on which ad to show to whom can be made one
consumer at a time.
Enter Quantcast.
Founded in 2006 by entrepreneurs Konrad Feldman and Paul
Sutter, Quantcast is a web measurement and targeting company
headquartered in San Francisco, California. Now with 250 employees,
Quantcast models marketers’ best prospects and finds similar or
lookalike audiences across the world. Connecting advertisers with their
best customers certainly isn’t easy, never mind maximizing yield for
publishers and delivering relevant experiences for consumers. To do
this, Quantcast software must sift through a veritable mountain of data.
Each month, it analyzes more than 300 billion observations of media
consumption (as of this writing). Today the company’s web visibility is
second only to Google. Ultimately, Quantcast attempts to answer some
very difficult advertising-related questions—and none of this would be
possible without Big Data.
I wanted to know more about how Quantcast specifically
uses Big Data, so I asked Jim Kelly, the company’s VP of R&D, and
Jag Duggal, its VP of Product Management. Over the course of a few
weeks, I spoke with them.
Steps: A Big Evolution
It is not the strongest of the species that
survives, nor the most intelligent that survives. It is the one that is
the most adaptable to change. - Charles Darwin
Quantcast understood the importance of Big Data from its
inception. The company adopted Hadoop from the get-go but found that
its data volumes exceeded Hadoop’s capabilities at the time. Rather than
wait for the Hadoop world to catch up (and miss a potentially large
business opportunity in the process), Quantcast took Hadoop to the next
level. The company created a massive data processing infrastructure that
could process more than 20 petabytes of data per day—a volume that is
constantly increasing. Quantcast built its own distributed file system
(a centerpiece of its current software stack) and made it freely
available to the open source community. The Quantcast File System[2]
(QFS) is a cost-effective alternative to the Hadoop Distributed File
System (HDFS) mentioned in Chapter 4. QFS delivers significantly
improved performance while consuming 50 percent less disk space.[3]
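The chapter does not say where the 50 percent figure comes from. One plausible reading is simple arithmetic, assuming HDFS keeps its default of three full copies of every block while QFS uses the Reed-Solomon encoding (six data stripes plus three parity stripes) described in its public documentation. The function names below are hypothetical, and this is a back-of-the-envelope sketch rather than a benchmark:

```python
def raw_bytes_replication(logical_bytes, copies=3):
    """Raw disk needed when every block is stored `copies` times (assumed HDFS default)."""
    return logical_bytes * copies


def raw_bytes_erasure(logical_bytes, data_stripes=6, parity_stripes=3):
    """Raw disk needed when data is striped and protected by parity stripes
    (assumed QFS Reed-Solomon 6+3 layout)."""
    return logical_bytes * (data_stripes + parity_stripes) / data_stripes


logical = 1_000_000  # one logical megabyte, chosen only for illustration
replicated = raw_bytes_replication(logical)   # 3.0x the logical size
encoded = raw_bytes_erasure(logical)          # 1.5x the logical size
saving = 1 - encoded / replicated

print(f"Disk saved relative to triple replication: {saving:.0%}")
```

Under those assumptions, each logical byte costs 1.5 bytes of disk instead of 3, which is where a halving of disk space would come from.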
In a Big Data world, complacency is a killer. New data
sources mean that the days of “set it and forget it” are long gone, and
Charles Darwin’s quote is as relevant now as it was 100 years ago. In
2006, like just about every company in the world, Quantcast practically
ignored data generated from mobile devices. Most Internet-related data
originated from desktops and laptops before iPhones and Droids arrived.
Of course, that has certainly changed over the past five years, and
Quantcast now incorporates these new and essential data sources into its
solutions.[4]
This willingness and ability to innovate have resulted in some nice press for the company. In February 2010,
Fast Company ranked Quantcast forty-sixth on its list of the World’s Most Innovative Companies.[5]
To this day, the company continues to expand and diversify its analytics products.
From its inception in 2006, Quantcast focused on
providing online audience measurement services, a critical part of the
advertising industry for both advertisers and publishers. Buyers and
sellers of advertising need a mutually agreeable source for determining how
many people they are reaching. Companies like Arbitron and Nielsen had
provided similar services for radio and TV for decades. These companies
used panels of users to extrapolate media consumption across the entire
population.
For the most part, these companies’ Small Data
approaches consisted of simply porting their panel-based methods to the
Internet. As discussed in earlier chapters, Small Data tools and
methods typically don’t work well with Big Data, something that
Quantcast understood early on. It built a Big Data–friendly system
tailored to the web’s unique characteristics. Millions of popular sites,
social networks, channels, blogs, and forums permeate the web.
Consumption is fragmented, making extrapolating from a panel extremely
difficult. Luckily, since each web page is delivered individually to a
user, such extrapolation is unnecessary. On the web, Quantcast
measures the “consumption” of each page directly.
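To make the contrast concrete, here is a minimal sketch of the two approaches. Everything in it is invented for illustration: the site names, the population of 100,000 users, the 1,000-person panel, and both function names. It is not Quantcast's method, just the general idea of extrapolating from a sample versus counting every delivered page.

```python
def direct_count(log):
    """Count distinct visitors per site from the full page-delivery log."""
    visitors = {}
    for user_id, site in log:
        visitors.setdefault(site, set()).add(user_id)
    return {site: len(users) for site, users in visitors.items()}


def panel_estimate(log, panel, population_size):
    """Count only panel members, then scale up to the whole population."""
    scale = population_size / len(panel)
    seen = {}
    for user_id, site in log:
        if user_id in panel:
            seen.setdefault(site, set()).add(user_id)
    return {site: len(users) * scale for site, users in seen.items()}


POPULATION = 100_000
# Hypothetical log: a big portal with 30,000 visitors, a niche blog with 20.
log = [(u, "big-portal") for u in range(0, 30_000)]
log += [(u, "niche-blog") for u in range(50_000, 50_020)]

# Hypothetical panel: every 100th user, 1,000 people in total.
panel = set(range(0, POPULATION, 100))

direct = direct_count(log)
estimate = panel_estimate(log, panel, POPULATION)

for site in direct:
    print(site, "direct:", direct[site], "panel estimate:", estimate.get(site, 0.0))
```

In this toy setup the panel does fine on the big portal but is badly off on the niche blog, because a single panelist stands in for 100 people. A slightly different panel would have missed the blog entirely. That is the fragmentation problem in miniature.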
Buy Your Audience
In 2009, Quantcast began development of an
“audience-buying” engine. With it, the company could leverage its vast
troves of data on consumers’ online media consumption. As real-time
ad exchanges such as the DoubleClick AdExchange arose, Quantcast quickly
got on board. Today, Quantcast is a major player in a market that
auctions off billions of ad impressions each day.
In November 2012, Quantcast released Quantcast
Advertise. The self-service platform enables advertisers, agencies, and
publishers to connect Big Data with discrete brand targets.[6]
With the right solutions, Big Data allows organizations to drill down
and reach very specific audiences. “A flexible compute infrastructure
was critical to our ability to produce more accurate audience
measurement services. That same infrastructure produced more accurate ad
targeting once ad inventory started to be auctioned in real-time,”
Duggal told me.
We saw earlier in this book how Amazon, Apple, Facebook,
Google, and other progressive companies eat their own dog food. Count
Quantcast among the companies that use their own Big Data tools. What’s
more, like Google, Quantcast makes some of its own internal Big Data
solutions available for free to its customers.[7]
Quantcast audience segments allow users to understand and showcase any
specific audience group for free. Once implemented, these segments
appear in users’ full publisher profile on Quantcast.com. As a result,
they can better represent their audiences.
Figure 5.1 shows some sample data from Quantcast’s Quantified dashboard.
To be sure, “regular” web traffic, click-through, and
purchase metrics might be sufficient for some businesses. However,
Quantcast knows that it can’t serve myriad clients across the globe with
a one-size-fits-all mind-set. No one
company can possibly predict every Big Data need. Different businesses
face vastly different data requirements, challenges, and goals. To that
end, Quantcast provides integration between its products and third-party
data and applications. What if customers could easily integrate their
own data and applications with Quantcast-generated data? What if its
clients wanted to conduct A/B testing, support out-of-browser and
offline scenarios, and use multiple, concurrent analytic
services—without impacting performance?
“Integration is central to everything we’re doing here,”
says Kelly. “It’s the source of all the data we work with and the means
by which it becomes relevant to the world.” And that advanced
integration isn’t stopping anytime soon. Case in point: Quantcast
created and offers an API built off the Microsoft Silverlight Analytics
Framework.[8]
Results
Consider the following results from some of Quantcast’s recent customer campaigns:
- A national after-market auto parts retailer relied upon digital
advertising to attract new customers and drive online sales. Quantcast
built predictive models from customers who had actually completed
an online purchase, distinguishing between passersby and converting
customers. The campaign eliminated the majority of superfluous
clicks, achieving a return on investment (ROI) greater than 200 percent.
- A major wireless phone company achieved a 76 percent increase in
conversion rates above its optimized content-targeted campaign.
Quantcast lookalike data allowed lead generation to garner significantly
higher conversion rates than content-targeted inventory purchased from
the same sources.
- A leading hotelier gained deep insights into the demographics,
interests, behaviors, and affinities of its customers. In the process,
it ultimately doubled its bookings.
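The chapter doesn't describe how Quantcast's lookalike models actually work, so the following is only a rough illustration of the general idea: score prospects by how closely their media consumption resembles that of known converters. The three features (visits to auto sites, DIY forums, and celebrity news), all of the numbers, and the function names are hypothetical, and cosine similarity to an average profile is a deliberately simple stand-in for a real predictive model.

```python
import math


def centroid(vectors):
    """Average profile of the seed audience (known converters)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two consumption profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_lookalikes(converters, prospects):
    """Return prospect names ordered from most to least similar to the seed."""
    seed = centroid(converters)
    scored = [(cosine(profile, seed), name) for name, profile in prospects.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical monthly visit counts: [auto sites, DIY forums, celebrity news]
converters = [[9, 7, 0], [8, 9, 1], [10, 6, 0]]
prospects = {
    "u1": [7, 8, 0],   # heavy auto and DIY reader
    "u2": [0, 1, 9],   # mostly celebrity news
    "u3": [5, 0, 5],   # mixed interests
}

ranked = rank_lookalikes(converters, prospects)
print(ranked)
```

An advertiser working from a ranking like this would bid on impressions for the prospects at the top of the list and skip the passersby at the bottom.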
Lessons
Compared to many organizations, Quantcast is a
relatively small company. This illustrates the point that an organization
doesn’t need to be big to benefit from—and innovate with—Big Data.
There’s no secret sauce, but embracing Big Data from day one
starts a company on the right path. Also, it’s critical to realize that
Small Data tools just don’t play nice with Big Data. Understand this,
and then spend the time, money, and resources to equip your employees
and customers with powerful self-service tools.