Alibaba has unveiled a new AI model that generates "movie-level" videos of people driven by audio. Wan2.2-S2V has 14 billion parameters and is available as open source on GitHub and other platforms.
The new model is capable of generating high-quality video from a single image and an audio clip. Wan2.2-S2V offers versatile character animation, allowing videos with various framing options, including portrait, bust, and full-body perspectives.
Alibaba says the model can dynamically generate character actions and environmental factors based on text prompts. The finished videos can be rendered at 480p or 720p resolution.
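To make the described inputs and outputs concrete, here is a minimal sketch of how such a speech-to-video workflow could be exposed as a command-line tool. This is illustrative only: the argument names and structure are assumptions, not the flags of the released Wan2.2-S2V code.

```python
"""Illustrative sketch of the inputs and outputs described in the article.
Argument names are assumptions, not the released Wan2.2-S2V interface."""
import argparse


def parse_args() -> argparse.Namespace:
    p = argparse.ArgumentParser(description="Speech-to-video generation sketch")
    p.add_argument("--image", required=True,
                   help="single reference image (portrait, bust, or full-body framing)")
    p.add_argument("--audio", required=True, help="driving audio clip")
    p.add_argument("--prompt", default="",
                   help="text prompt controlling character actions and environment")
    p.add_argument("--resolution", choices=["480p", "720p"], default="720p",
                   help="the two output resolutions Alibaba mentions")
    return p.parse_args()


if __name__ == "__main__":
    args = parse_args()
    print(f"Would generate a {args.resolution} video from {args.image} "
          f"driven by {args.audio} with prompt: {args.prompt!r}")
```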
Wan2.2-S2V combines global, text-driven motion control with fine-grained local motions driven by audio, allowing for more natural-looking character performances even in challenging scenarios.
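One plausible way to read this combination is a global, clip-level conditioning signal from the text prompt added to per-frame conditioning from the audio. The sketch below illustrates that general idea under these assumptions; the module, shapes, and fusion scheme are not taken from the released model.

```python
"""Minimal sketch, assuming fusion works roughly as: a global text-derived
motion signal is broadcast over all frames, while per-frame audio features
contribute fine-grained local motion. Not the actual Wan2.2-S2V design."""
import torch
import torch.nn as nn


class GlobalLocalMotionFusion(nn.Module):
    def __init__(self, text_dim: int, audio_dim: int, hidden_dim: int):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)    # global: one vector per clip
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)  # local: one vector per frame

    def forward(self, text_emb: torch.Tensor, audio_feats: torch.Tensor) -> torch.Tensor:
        # text_emb: (batch, text_dim); audio_feats: (batch, frames, audio_dim)
        global_motion = self.text_proj(text_emb).unsqueeze(1)  # (batch, 1, hidden)
        local_motion = self.audio_proj(audio_feats)            # (batch, frames, hidden)
        return global_motion + local_motion                    # broadcast over frames


# Example usage with random tensors standing in for text and audio encodings
fusion = GlobalLocalMotionFusion(text_dim=768, audio_dim=512, hidden_dim=1024)
cond = fusion(torch.randn(2, 768), torch.randn(2, 48, 512))
print(cond.shape)  # torch.Size([2, 48, 1024])
```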
The Chinese company notes that another key breakthrough of the model is its innovative frame-processing technique: the model compresses frame histories of arbitrary length into a single compact representation, significantly reducing computational requirements. However, the company does not specify how long the generated videos can be.
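As a rough illustration of the general idea (not Alibaba's actual technique), compressing an arbitrary-length frame history into a fixed-size representation can be sketched as pooling a variable number of frame latents down to a constant number of tokens, so the cost of conditioning on history no longer grows with clip length.

```python
"""Sketch of the general idea only: an arbitrary-length sequence of frame
latents is pooled to a fixed number of compact tokens. The pooling choice
and sizes are assumptions for illustration."""
import torch
import torch.nn.functional as F


def compress_frame_history(frame_latents: torch.Tensor, target_frames: int = 4) -> torch.Tensor:
    # frame_latents: (frames, channels) -- any number of frames in, fixed number out
    x = frame_latents.t().unsqueeze(0)                # (1, channels, frames)
    pooled = F.adaptive_avg_pool1d(x, target_frames)  # (1, channels, target_frames)
    return pooled.squeeze(0).t()                      # (target_frames, channels)


# Example: 73 historical frame latents compressed to 4 compact tokens
history = torch.randn(73, 256)
print(compress_frame_history(history).shape)  # torch.Size([4, 256])
```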