San Francisco: As part of its effort to build machines that learn the way humans do, Facebook has announced a new project designed to automatically learn audio, textual, and visual representations from the data in publicly available videos uploaded to the social networking platform.
“By learning from videos spanning nearly every country and hundreds of languages, this project will not just help us continuously improve our core AI systems for applications like content recommendation and policy enforcement — it will enable entirely new experiences,” Facebook said in a blog post on Friday.
Over the last couple of years, Facebook has made substantial improvements in self-supervised learning across speech, vision, and language.
These advancements have made AI systems less dependent on labelled data sets — a fundamental bottleneck on the pace of AI innovation — so that AI can start understanding the world through vast amounts of observational data like humans do.
While announcing the new project called “Learning from Videos”, Facebook said that building AI that learns from publicly available videos will help it create machines that better analyse uncurated, real-world sights and sounds — not just examples that are part of a much smaller, hand-curated data set.
“Although we’ve just scratched the surface, using semi- and self-supervised learning on the videos uploaded to Facebook has already improved our computer vision and speech recognition systems,” Facebook said.
“Within six months of developing a state-of-the-art, self-supervised framework for video understanding, we’ve built and deployed an AI model in Instagram Reels’ recommendation system,” it added.
Facebook recently announced a new AI model that can learn from any random group of images on the Internet, without the careful curation and labelling that goes into most computer vision training today.
Called SEER (SElf-supERvised), the computer vision model was trained on a billion random, unlabelled and uncurated public Instagram images.
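To illustrate the general idea behind this kind of self-supervised training, the sketch below implements a contrastive loss of the sort popularised by methods like SimCLR: two augmented "views" of the same image should produce similar embeddings, while embeddings of different images are pushed apart, with no labels involved. This is an illustrative NumPy example with hypothetical function names, not Facebook's SEER code, which used its own self-supervised method and model architecture.

```python
import numpy as np

def l2_normalize(x):
    # Project embeddings onto the unit sphere so cosine similarity
    # reduces to a dot product.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def contrastive_loss(z1, z2, temperature=0.5):
    """Contrastive (NT-Xent-style) loss over two views of a batch.

    z1, z2: (n, d) embeddings of two augmentations of the same n images.
    For each embedding, its counterpart in the other view is the positive;
    all other 2n - 2 embeddings in the batch act as negatives.
    """
    n = z1.shape[0]
    z = l2_normalize(np.concatenate([z1, z2], axis=0))  # (2n, d)
    sim = z @ z.T / temperature                         # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                      # exclude self-pairs

    # Row i's positive is row i + n (and vice versa).
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Cross-entropy over similarities, computed in a numerically stable way.
    logits = sim - sim.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(2 * n), targets].mean()
```

Minimising this loss requires no human annotation at all: the "supervision" comes entirely from the pairing of two views of the same raw image, which is what lets such methods scale to a billion uncurated photos.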