That interview is good background for understanding the early-March announcement by Adobe and Time Warner of "a strategic alliance to foster collaboration on the development of next generation video and rich media experiences." In the Beet.TV interview below, Adobe's Jennifer Taylor explains the alliance, mentioning planned collaboration on implementing Adobe's video ecosystem ('from planning to playback') and on digital rights management, metadata & search, and audience measurement & monetization. On his blog, John Dowdell fleshed out some aspects of the announcement that seemed a bit vague. It seems DRM needs to be in place before HBO rolls out shows like The Sopranos, The Wire, and Entourage.
There's not much for regular users from Google or Adobe on this front quite yet, though you can find a Delve example of exposed speech metadata in Speech-to-Text metadata in web video. If you want to understand Adobe's video metadata pipeline, Dan Ebberts recently posted an article at Adobe, XMP metadata in Creative Suite 4 Production Premium, that nicely steps you through current metadata and speech features (see that article's comments for an alternative method for After Effects).
There's more background on metadata in previous posts here.
Update: Contentinople notes that Gotuit Enables Video Mashups With Metadata:
"Video metadata management firm Gotuit is signing up media customers by enabling them to chop up, mix and match, and create interactive video mashups...
QuickKicks allows users to view video feeds from each game -- broken down into categories such as game highlights, goals, and saves -- and will include specific player highlights from each team. The site also includes the ability for users to create their custom playlists and share those playlists with friends.
While the ability to create video mashups isn't particularly new, Gotuit has taken a novel approach to the feature, by enabling content companies to use metadata to define where video clips start and stop.
In other words, rather than editing the full-length video of a soccer match into multiple smaller video files, Gotuit works by allowing content owners to create "clips" by marking start and end points within the larger video file. The publisher can then identify what's happening in those clips with certain pre-defined metadata tags, for easy search and discoverability."
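To make the idea in that quote a bit more concrete, here's a rough Python sketch of "clips as metadata": each clip is just a start time, an end time, and some tags pointing into one full-length file, and a playlist is an ordered list of those references. This is only an illustration of the general approach, not Gotuit's actual data model or API; the `Clip` class, the `find_clips` and `make_playlist` helpers, the file name, and the tag names are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Clip:
    """A virtual clip: a time range plus tags, pointing into one source video."""
    source: str                 # the full-length video; it is never re-encoded or cut
    start: float                # seconds from the start of the source video
    end: float
    tags: frozenset = field(default_factory=frozenset)

    def __post_init__(self):
        if not 0 <= self.start < self.end:
            raise ValueError("clip needs 0 <= start < end")

    @property
    def duration(self) -> float:
        return self.end - self.start


def find_clips(clips, *wanted):
    """Return the clips that carry every requested tag, in playback order."""
    need = set(wanted)
    return sorted((c for c in clips if need <= c.tags), key=lambda c: c.start)


def make_playlist(clips):
    """A playlist is just an ordered list of (source, start, end) references."""
    return [(c.source, c.start, c.end) for c in clips]


MATCH = "match_full.mp4"  # hypothetical file name
clips = [
    Clip(MATCH, 312.0, 330.5, frozenset({"goal", "home"})),
    Clip(MATCH, 95.0, 104.0, frozenset({"save", "away"})),
    Clip(MATCH, 2710.0, 2731.0, frozenset({"goal", "away"})),
]

goals = find_clips(clips, "goal")
playlist = make_playlist(goals)
```

Because nothing is ever cut out of the source file, the same footage can show up in any number of playlists, and re-tagging or adjusting a clip boundary is a metadata edit rather than a re-encode. A player would still need some way to seek to each start point and stop at each end point, which this sketch leaves out.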