How Gracenote Became a Big Metadata Player

Gracenote Rhythm is a lot like Pandora with a RESTful JSON API.

Last week, Tribune Company completed the $170 million acquisition of Gracenote, a deal originally set in motion in late 2013. The merger is an unusual one: Gracenote owns a massive library of media metadata, and the Tribune Company is best known as the publisher of print newspapers and tabloids, most notably its flagship paper in Chicago. Five years ago, Tribune Company filed for bankruptcy as advertising revenues declined, the result of the global recession; at the time, Gracenote had just been acquired by Sony for almost $100 million more than its most recent price.

The acquisition places Gracenote directly alongside Tribune Media Services (TMS), a division within Tribune Company that syndicates both editorial content and media metadata such as movie showtimes and television listings. Gracenote’s data is similar, although focused more on individual consumption habits than on mass presentation. “Gracenote’s video products have been using TMS data for some time, re-distributing their data as part of our products,” said Gracenote president Stephen White, “so it was a pretty natural fit.”

That fit wouldn’t have seemed so natural for the earlier iterations of Gracenote, which got its start in the 1990s (originally under the rather awkward moniker CDDB) and was, initially, a mechanism for redistributing track listings from compact discs over the Internet, thereby saving users the trouble of typing in all the album data when using a player that could display artists and track names. During the Sony years, the company expanded into video; these days, one of Gracenote’s major pushes centers on embedded media systems in cars; White suggested that Gracenote is now used in about 50 million vehicles. To put that in perspective, that’s roughly a quarter of the new cars and light trucks that have been cumulatively sold in the United States since Gracenote first began working on car audio back in 2001, according to the auto industry analysts at Ward Auto.

Those numbers are nothing to sneeze at, and certainly open up a new market for the Tribune Company. However, there’s an elephant in the room: Apple’s “iOS in the Car” initiative, first announced at WWDC in June 2013, which seeks to create a bridge between iOS mobile devices and a driver-safe in-dashboard user interface. Rumor suggests the platform will launch later this spring, at which point it might very quickly outmode other Gracenote-powered car media systems, thanks in no small part to Apple’s notorious perfectionism in interface design, as well as an (expected) plethora of iOS features.

Regardless of the Tribune Company’s specific plans for Gracenote’s datasets and technical infrastructure, it spent a hefty amount of cash on an entity devoted solely to compiling metadata about copyrightable works owned by third parties. In other words, Gracenote still commands a nine-figure price tag when its primary product, to put it bluntly, amounts to footnotes and annotations to media for which it doesn’t have licenses or rights. In 2007, the year before Gracenote was sold to Sony, the All Media Guide was purchased by Macrovision for a reported $72 million; Gracenote sold for more than twice as much. The Tribune deal is also the first major purchase of a metadata-hosting firm since NSA spooks made “metadata” a household word.

The “meta” in metadata is slowly fading away, however – in a world where everything is connected, all data is inherently referential. “For most people that distinction is lost,” White said. “We realized pretty early on that metadata was going to be hugely important in a world where you’ve got this explosion of content, this explosion of connected devices that are trying to develop services and experiences around that content, and the only real way to do that is to understand what the content is.”

A case in point is a forthcoming project known as Gracenote Rhythm, a personalized radio station service that compiles songs into coherent playlists; on paper, it seems a lot like Pandora with a RESTful JSON API. (In contrast, Pandora has kept its elaborate metadata system, known as the Music Genome Project, entirely proprietary.) “It leverages both editorial-based and kind of machine-based descriptors,” White explained. “We believe the combination of those two things, you know, really differentiate the results from the perspective that they have that human touch and that human element that is impossible to capture purely through machines, but then they have the scale, and the ability to cover over a hundred million tracks, which you can really only get through some of the machine learning and machine listening capabilities.” The latter curiously includes computer assessments of fundamentally human and subjective traits such as “mood,” which seems odd but was inevitable: “You have to classify mood at the track level; trying to do that editorially across a hundred million tracks is just an unscalable approach.”

It’s hard to imagine Gracenote’s custom radio stations making much of a splash, given how many alternative services already exist, but the personalization features depend on the compilation and retention of user profiles, whose marketing value may far exceed the simple transactional value, if any, of sending little blips of metadata to audio players and partner services. Which returns us to the realm of NSA spooks: while the titles of the songs in your playlists shouldn’t be conflated with records of your phone calls, services such as Rhythm may help Gracenote partially convert its library of media metadata into a library of user data. “We do have big hopes for that part of our business going forward,” confirmed White.

Adi Kamdar, an activist with the Electronic Frontier Foundation, also sees it as potent territory, albeit in a more skeptical sense. “We’re seeing, especially with the ad space, that companies are trying to get user information from all different sources, and it’s not just what brands are looking for anymore,” he said. “They’re trying to get location data, financial data, habits, family… so I’m not surprised that audio data could be one of the big facets. It could be something that paints a bigger, more in-depth picture about these people.” And if Gracenote’s data set seems a little narrowly focused, they just need to find the right partners: “Service providers, stores, and web sites are basically there to fill in the blanks.”

White seems concerned when discussing privacy issues. “All that stuff has to be done very carefully,” he said. “We’re strong believers that if you’re not using that content in a way that is providing direct and understandable value back to the consumer, then you shouldn’t be doing it, and you’ve got to ask the consumer and be able to explain to them why you’re collecting the data you are, and what you’re doing with it, and how they benefit.” For example, Beats Music, the new service launched by Dr. Dre and Jimmy Iovine last month, starts its intake process with a profile questionnaire: as The Verge noted, it’s “a warm welcome to the world of Beats, since most other music apps just plop you onto a New Releases page upon signing up.” By providing useful services built atop large sets of media metadata, these companies can simultaneously build treasure troves of user metadata which may prove much more lucrative. (On the other hand, Spotify broadcasts your listening habits into a public activity feed by default, which is handy for sharing songs with friends, but may result in unintended exposure for users who tend away from exhibitionist listening.)

“The dialogue probably started for a lot of wrong reasons,” White continued. “We don’t want metadata to become demonized as a term. But I think that’s all really positive for them to be thinking about and for them to understand; I think it’s a great thing for the consumer to be educated.”

These are all probably necessary pivots for a company facing stiff competition from Apple, Spotify, and even the intrinsic metadata tags included on most modern media files, which usually mean the simple textual track data supplied by the earliest versions of the CDDB service no longer needs to be imported at all. But White points to Gracenote’s 16 billion queries per month, sometimes peaking around 800 million per day, from a worldwide user base of 200 million people, and even plenty of regular old CD ripping operations in countries where Internet access is only now becoming commonplace. “If we were a search engine, we’d be the second largest search engine in the world,” he noted. All of which is to say that Gracenote’s Rhythm radio station service is one of the first with a developer API. The recent decrease in Gracenote’s price tag notwithstanding, there may still be a market for sophisticated media metadata and the associated tech infrastructure. Someone just has to build it first.

 

Image: Gracenote
