Google AudioSet aims to make sounds, from roars to boings, searchable
Google researchers have released a collection of 2 million-plus labeled audio snippets designed to spark innovation in the area of sound search.
Earlier this month the company published a paper titled "AudioSet: An ontology and human-labeled dataset for audio events." It hopes the work will combine with image recognition to strengthen overall search and identification capabilities across a wide variety of machine learning applications, including automated video captions that include sound effects. Google began work on the project last year.
To create AudioSet, Google drew on its YouTube business to collect more than 2 million ten-second YouTube excerpts (totaling about 5,800 hours of audio), labeled with more than 500 sound categories. Categories start at high levels such as Human Sounds and Music, and then get more specific, such as Whistling and Music Genre.
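As a rough sanity check on those figures, the short Python sketch below converts between clip counts and total hours. It assumes every excerpt is exactly ten seconds long, which is a simplification; the function names are illustrative and not part of any Google tooling.

```python
SECONDS_PER_HOUR = 3600


def total_hours(num_clips: int, clip_seconds: float = 10.0) -> float:
    """Total audio duration in hours for num_clips clips of clip_seconds each."""
    return num_clips * clip_seconds / SECONDS_PER_HOUR


def clips_for_hours(hours: float, clip_seconds: float = 10.0) -> float:
    """Number of clips of clip_seconds each needed to fill the given hours."""
    return hours * SECONDS_PER_HOUR / clip_seconds


# Exactly 2 million ten-second clips come to a bit under 5,600 hours.
print(round(total_hours(2_000_000)))

# Reaching roughly 5,800 hours takes closer to 2.1 million clips,
# which fits the "2 million-plus" description.
print(round(clips_for_hours(5_800)))
```

Under that assumption, an even 2 million clips falls short of 5,800 hours, so the hour total suggests the collection is somewhat larger than 2 million excerpts.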
