February 17, 2009

Speech-to-Text metadata in web video

Google Labs has its GAUDI speech-to-text program, as reported by Beet.TV. Announced last June, it has so far been used only on some political speech videos. Maybe Google is waiting for some backend changes before expanding this feature on YouTube to improve search, captions, and subtitles.

Adobe execs also dropped word on their own approach, to some fanfare, in May and August (again, these are on Beet.TV). Adobe has even shown some of it at recent CS4 Road Shows, but it seems we'll have to wait just a little longer for the Adobe Labs project.

Hopefully, Adobe will someday have a solution that's a bit more automated than what's shown in Dan Ebberts' article posted yesterday at Adobe, XMP metadata in Creative Suite 4 Production Premium (via Kopriva). Dan nicely steps you through the metadata and speech features if you want to understand the video metadata pipeline, though a final example video didn't seem to be posted. More info and tutorials on Adobe metadata are listed at the end of Dan's article and in previous posts here.

Delve Networks posted an experiment a few weeks ago using President Obama's inaugural speech in a Flash player. You can type what you’re looking for into the player's search bar below. When you mouse over the "heatmap" you'll see clickable tags related to your topic; the interface seems to be better than the word-meaning relations.

Update: Beet.TV says, "Believe It: Transcriptions Will Drive Online Video Consumption and MSNBC.com is Paving the Way."

Update 2: via John Dowdell, it seems that Adobe's work on this for programmers is available as the XMP Library for ActionScript on Adobe Labs. And coming full circle, back at Dan Ebberts' article mentioned above, Todd Kopriva comments,

"Gunar Penikis has recently announced the availability of an XMP library for ActionScript, which can be used to directly access XMP metadata in FLV and F4V files:


This library provides an alternative to the method that Dan's tutorial outlines, which uses an After Effects script to convert XMP metadata to cue points, which are then operated on by ActionScript code in the video player."
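
For anyone curious how the cue-point idea supports search, here is a rough sketch in Python (not ActionScript, and not Adobe's or Delve's actual code). It assumes a speech-to-text pass has already produced a list of (time, word) pairs; the sample data and the names build_index, find_word, and heatmap are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical cue points: (start time in seconds, spoken word).
# A real pipeline would read these from the video's speech metadata.
CUE_POINTS = [
    (2.0, "today"),
    (12.5, "nation"),
    (20.0, "work"),
    (33.0, "today"),
    (48.0, "Nation"),
]


def build_index(cue_points):
    """Map each lowercased word to the list of times at which it is spoken."""
    index = defaultdict(list)
    for seconds, word in cue_points:
        index[word.lower()].append(seconds)
    return dict(index)


def find_word(index, query):
    """Return the times (in seconds) a player could seek to for this word."""
    return index.get(query.strip().lower(), [])


def heatmap(times, duration, buckets=4):
    """Count hits per equal-width slice of the timeline."""
    counts = [0] * buckets
    for t in times:
        # Clamp so a hit exactly at the end lands in the last bucket.
        slot = min(int(t / duration * buckets), buckets - 1)
        counts[slot] += 1
    return counts


index = build_index(CUE_POINTS)
hits = find_word(index, "Nation")
print(hits)
print(heatmap(hits, duration=60.0))
```

A player would then seek to one of the returned times when a hit is clicked; the bucket counts are one simple way to shade a timeline like the "heatmap" in the Delve player, though how Delve actually builds theirs isn't documented here.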


Todd Kopriva said...

But wait! There's more!

searchable video and XMP metadata: another way

Rich said...

Thanks for the alert!

Red said...

Does anybody know of a tutorial for how Delve Networks made the player above?