Google has announced Veo 2, a new version of its AI model for video generation. The company also presented an updated Imagen 3 model for creating images and a new tool, Whisk, which takes images as prompts instead of text.
Veo 2 has “an improved understanding of real-world physics and the nuances of human movement and facial expressions.” Google promises that the model will “hallucinate less.” Veo 2 will also understand specialized prompts specifying genre, lens, and cinematic effect.
Google is rolling out Veo 2 to VideoFX (not available in Ukraine) and is "expanding the number of users who can access it," but there's still a waiting list. The model will be available to creators on YouTube Shorts and other products in 2025.
The company has also improved its Imagen 3 image generation model. It now supports more artistic styles, from photorealism to impressionism, from abstraction to anime. The model follows prompts more accurately and reproduces richer details and textures. The updated Imagen 3 is available only in ImageFX.
Google's newest addition is the Whisk image generator, which lets you supply reference images instead of text prompts and then combine and remix them.
At its core, Whisk combines Imagen 3 with Gemini’s visual understanding capabilities. The Gemini model automatically writes detailed captions for the images it receives, then passes these descriptions to Imagen 3.
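The caption-then-generate flow described above can be illustrated with a rough sketch. This is not Google's code or API: `whisk_style_pipeline`, `fake_caption`, and `fake_generate` are hypothetical names, and the two stand-in functions only imitate what a captioning model and an image model would do.

```python
from typing import Callable, List


def whisk_style_pipeline(
    images: List[str],
    caption_image: Callable[[str], str],
    generate_image: Callable[[str], str],
) -> str:
    """Caption each reference image, merge the captions, and generate from the result."""
    # Step 1: a vision-language model (Gemini, in the article) describes each image.
    captions = [caption_image(image) for image in images]
    # Step 2: the descriptions are merged into a single text prompt.
    prompt = " ".join(captions)
    # Step 3: a text-to-image model (Imagen 3, in the article) renders the prompt.
    return generate_image(prompt)


# Hypothetical stand-ins so the sketch runs without any real model.
def fake_caption(image_name: str) -> str:
    return f"a detailed description of {image_name}."


def fake_generate(prompt: str) -> str:
    return f"<image generated from: {prompt}>"


result = whisk_style_pipeline(["subject.png", "scene.png"], fake_caption, fake_generate)
print(result)
```

Because the image model in this sketch sees only text, anything a caption leaves out never reaches the final picture.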