Google Research has developed an AI tool called VLOGGER that can generate realistic videos of people talking, gesturing, and moving, all from a single photo, VentureBeat reports.

The generated videos are not perfect yet, but they mark a significant leap in the ability to animate still images.

“In contrast to previous work, our method does not require training for each person, does not rely on face detection and cropping, generates the complete image (not just the face or the lips), and considers a broad spectrum of scenarios (e.g. visible torso or diverse subject identities) that are critical to correctly synthesize humans who communicate,” the authors explained.

To build the system, the researchers used a dataset containing more than 800,000 distinct identities and 2,200 hours of video. This allowed VLOGGER to learn to generate videos of people of different ethnicities and ages, in varied clothing and environments.

Google sees VLOGGER as a step toward introducing “conversational agents” that can interact with people naturally through speech, gestures, and eye contact.

While the technology opens up a number of potential applications, it also raises concerns about the creation of deepfakes and the spread of disinformation.

In addition, VLOGGER still has limitations. The generated videos are relatively short and have a static background.