Kaushik Shivakumar
@19kaushiks
Gemini @GoogleDeepMind. BS and MS from @berkeley_eecs. Opinions my own.
Powered by native image generation ⚡
Starting today, we're testing new image editing features in Gemini like changing backgrounds, swapping objects, and more. They’re rolling out gradually to all users. We're continuing to improve image generation, so stay tuned.
Some cool people did some cool things to get some cool results :)
Exciting Text-to-Image leaderboard update! Two new Imagen 4.0 models from @GoogleDeepMind just dropped: 🥇 Imagen 4.0 Ultra (v2) ties at #1 with @OpenAI’s GPT-Image-1 🥉 Imagen 4.0 (v2) lands strong at #3 Congrats to the Google Imagen team!
Btw as an aside, we didn’t announce on Friday because we respected the IMO Board's original request that all AI labs share their results only after the official results had been verified by independent experts & the students had rightly received the acclamation they deserved
How does an AI model actually learn to see? 🤖 Learn about the tech behind native multimodality, how models reason over visual data like documents and video, and the future of proactive AI assistants with @OfficialLoganK and Gemini Model Behavior Product Lead, @AniBaddepudi. ↓…
We're imagining a future where Gemini can see what you see -- as @AniBaddepudi says, "Everything is vision" Catch Ani & @OfficialLoganK talking about Gemini's SOTA ability to understand videos, images, documents, how we got here and where we're going! youtube.com/watch?v=K4vXva…
Big welcome to @_mohansolo and others from the Windsurf team joining DeepMind :)
Our latest Gemini 2.5 Pro update is now in preview. It’s better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads @lmarena_ai with a 24pt Elo score jump since the previous version. We also…
Excited to share what my team has been working on lately - Gemini diffusion! We bring diffusion to language modeling, yielding more power and blazing speeds! 🚀🚀🚀 Gemini diffusion is especially strong at coding. In this example the model generates at 2000 tokens/sec,…
We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
Gemini 2.5 Pro with "Deep Think", our models just keep getting more SOTA, more to share soon : )
Image Generation is now available in preview with increased rate limits for Gemini 2.0 Flash. ⚡️
Go build! developers.googleblog.com/en/generate-im…
We just made Gemini 2.5 Pro even better at coding, with an improved @lmarena_ai Elo. Hope you’ll like it 🚢
Gemini 2.5 Pro just got an upgrade & is now even better at coding, with significant gains in front-end web dev, editing, and transformation. We also fixed a bunch of function calling issues that folks have been reporting, it should now be much more reliable. More details in 🧵
Best not to overindex on LMArena, but the nice part of all of our releases is you can try them yourself, for free, on ai.dev :)
🚨Breaking: @GoogleDeepMind’s latest Gemini-2.5-Pro is now ranked #1 across all LMArena leaderboards 🏆 Highlights: - #1 in all text arenas (Coding, Style Control, Creative Writing, etc) - #1 on the Vision leaderboard with a ~70 pts lead! - #1 on WebDev Arena, surpassing Claude…
Inside The Waymo Factory Building A Robotaxi Future on.forbes.com/60182AxAj (Photo: Waymo)
Was finishing up college when GPT-4 came out. I remember spending an evening frantically searching for tasks that it *didn't* work on to convince myself that human intelligence still mattered 😅, good times.
goodbye, GPT-4. you kicked off a revolution. we will proudly keep your weights on a special hard drive to give to some historians in the future.
The Gemini 2.5 series is the world’s best model lineup 👍
Gemini 2.5 Flash is a great workhorse model for most tasks
Excited to introduce Gemini 2.5 Flash, our most cost-efficient thinking model. We are once again at the frontier here. Pretty good, well-rounded performance.