Google launched Gemini Omni at its annual I/O developer conference. This multimodal AI model generates and edits video from text, images, and audio.
Users can edit clips conversationally and combine media into new content. Google is rolling out the initial Omni Flash version to the Gemini app and YouTube Shorts.
Its understanding of physical concepts such as gravity and kinetic energy helps it create realistic scenes. The launch targets the market gap left by the discontinuation of OpenAI’s Sora model.