Antoine Yang
@AntoineYang2
Senior Research Scientist @GoogleDeepMind, Gemini video 💎. Prev: PhD @Inria & @ENS_ULM, MEng @Polytechnique.
Thrilled to share our latest advances in video understanding 📽️: Gemini 2.5 Pro is a truly magical model to play with, excelling in traditional video analysis and unlocking new use cases I could not imagine a few months ago🪄 More in 🧵 and @Google blog: developers.googleblog.com/en/gemini-2-5-…
Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵
The results are in, and they're revealing. Only Gemini 2.5 Pro handles 1-hour-long videos. Performance drops sharply with duration, proving that long video understanding is still challenging. We've found the breaking points; now the community can start fixing them.📈
The plot thickens. 🕵️ Use Gemini 2.5 Pro to turn random videos from your camera roll into a dramatic narrative. Try it yourself: 1️⃣ Go to Gemini and select “2.5 Pro” from the model dropdown. 2️⃣ Tap the plus icon and upload three videos from your camera roll. (Pro/Ultra…
We just shipped video FPS support in the Gemini API, so you can dynamically customize how many frames per second you want the model to see, unlocking lots of interesting new video use cases! 📹
Come to our poster today at #CVPR2025! 📍 Poster Session 4 #301 | ExHall D 📅 June 14, 17:00-19:00 📄 Paper: arxiv.org/abs/2504.00072 🔗 Project: imagine.enpc.fr/~lucas.ventura… 💻 Code: github.com/lucas-ventura/…
Introducing Chapter-Llama [#CVPR2025], a framework for 𝐯𝐢𝐝𝐞𝐨 𝐜𝐡𝐚𝐩𝐭𝐞𝐫𝐢𝐧𝐠 using Large Language Models! 🎬🦙 Check it out: 📄 Paper: arxiv.org/abs/2504.00072 🔗 Project: imagine.enpc.fr/~lucas.ventura… 💻 Code: github.com/lucas-ventura/… 🤗 Demo: huggingface.co/spaces/lucas-v…
Gemini 2.5 Pro, our most intelligent model, is getting an update before general availability. ✨ It’s even better at coding 🖥️, reasoning 💡, and creative writing ✍️ Learn more. 🧵
By popular request, you can now specify frames per second (fps), as well as start and end times, for videos in AI Studio ⏩

Our native audio capabilities are making AI conversations more natural – from understanding tone to generating expressive speech. ✍️🗣️ This could open up new possibilities for how we interact with AI. Developers, try it through @Google AI Studio. Learn more. ↓…
used gemini 2.5 pro to build a simple shot counter for myself + give jordan feedback per shot.
Last year, we introduced Project Astra: a research prototype exploring capabilities for a universal AI assistant. 🤝 We’ve been making it even better with improved voice output, memory and computer control - so it can be more personalized and proactive. Take a look ↓ #GoogleIO
We fixed a few bugs in the "low media resolution" option in the Gemini API over the last few days. You should now be able to play with >2H videos within the 1M context window with Gemini 2.5 Pro. Please keep the feedback coming, we'll keep fixing 🚣‍♂️
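A rough back-of-the-envelope sketch of why the low media resolution setting matters for >2H videos in a 1M-token context. The per-frame and per-second token costs below are assumptions (approximate figures, not stated in the posts above), and the helper names are hypothetical; this does not call the Gemini API.

```python
# Assumed approximate token costs; NOT taken from the posts above.
TOKENS_PER_FRAME = {"default": 258, "low": 66}  # assumption: tokens per sampled frame
AUDIO_TOKENS_PER_SECOND = 32                    # assumption: tokens per second of audio
CONTEXT_WINDOW = 1_000_000                      # the "1M context window" from the post


def estimate_video_tokens(duration_s: float, fps: float = 1.0,
                          resolution: str = "default") -> int:
    """Estimate the tokens a video consumes at a given sampling rate and resolution."""
    frames = int(duration_s * fps)  # number of frames the model would see
    return frames * TOKENS_PER_FRAME[resolution] + int(duration_s * AUDIO_TOKENS_PER_SECOND)


def fits_in_context(duration_s: float, fps: float = 1.0,
                    resolution: str = "default") -> bool:
    """Check whether the estimated token count fits within the context window."""
    return estimate_video_tokens(duration_s, fps, resolution) <= CONTEXT_WINDOW


if __name__ == "__main__":
    two_hours = 2 * 60 * 60  # 7200 seconds
    for res in ("default", "low"):
        tokens = estimate_video_tokens(two_hours, fps=1.0, resolution=res)
        print(res, tokens, fits_in_context(two_hours, resolution=res))
```

Under these assumed costs, a 2-hour video sampled at 1 fps overshoots the window at default resolution but fits at low resolution. Lowering the fps shrinks the frame term in the same way.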
