Aditya Kusupati
@adityakusupati
Been places..... Done things.... Adaptive Compute for Gemini and beyond @GoogleDeepMind
Agents aren’t reliable. They don’t learn from experience. At @composiohq, we provide skills that evolve with your agents. @lightspeedvp gave us $25M to make agents usable.
Want to be part of a team redefining SOTA for generative video models? Excited about building models that can reach billions of users? The Veo team is hiring! We are looking for amazing researchers and engineers, in North America and Europe. Details below:
I cannot emphasize this enough: the system uses no tools, no Lean. Text in, text out. And the more we scale inference compute, the more accurate the proofs get, while still reading like natural text.
Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level in the International Mathematical Olympiad! 🏆, solving five out of six problems perfectly, as verified by the IMO organizers! It’s been a wild run to lead this…
Super thrilled to share that our AI has now reached silver medalist level in Math at #imo2024 (1 point away from 🥇)! Since Jan, we now not only have a much stronger version of #AlphaGeometry, but also an entirely new system called #AlphaProof, capable of solving many more…
Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with a ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, and DINOv2 on various benchmarks, setting a new standard for open-source research 🧵
Check out our #ICML2025 poster on Matryoshka Quantization tomorrow, being presented by @puranjay1412 on behalf of all the authors.
Hi, I'll be presenting Matryoshka Quantization (arxiv.org/abs/2502.06786) on 16th July at #ICML2025 📍East Exhibition Hall A-B #3606 ⏲️ 11 AM - 1:30 PM
FYI, as @jainprateek_ points out, @puranjay1412 is applying to grad schools. x.com/jainprateek_/s…
Puranjay will present our poster on nested bitwise models or MatQuant, so if you are at ICML and interested in the topic, do bother him :) Puranjay is going on the grad-school market this cycle. So if you are looking for a brilliant, hardworking student with good ML+LLM exposure,…
🪆 Matryoshka is extremely general & applicable to every component in our modern ML/DL stack. It can't get more fundamental than 🪆 in bit space to enable elastic quantization! Drop by the poster and say hi to Puranjay (on behalf of @pranavn1008 @JeffDean @jainprateek_ & me).
Compression is the heart of intelligence. From Occam to Kolmogorov: shorter programs = smarter representations. Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, a token budget T, and a target quality 𝜖, KARL finds the smallest t≤T to reconstruct it within 𝜖 🧵
Introducing MatFormer Lab for Gemma 3n 🧑🔬 Use Mix-n-Match to slice the E4B and create a model with a custom size between 2B and 4B effective parameters. Explore the quality-size trade-off and share your models with the community. Try it out: goo.gle/gemma3n-matfor…
Gemma 3n is powered by the MatFormer (Matryoshka Transformer) architecture, a model containing smaller, fully-functional submodels. Learn more 🧵↓
We're looking for people to join us to work on Gemini Diffusion and help revolutionize language modeling! Details below: job-boards.greenhouse.io/deepmind/jobs/…
Excited to share what my team has been working on lately - Gemini diffusion! We bring diffusion to language modeling, yielding more power and blazing speeds! 🚀🚀🚀 Gemini diffusion is especially strong at coding. In this example the model generates at 2000 tokens/sec,…
We have now traced the main cause of poor quality in Gemma 3n MobileNet V5 to the conv layers' HW dimensions being incorrectly swapped in the JAX-to-PyTorch checkpoint conversion, along with other minor issues. We are pushing out the new ckpt and expect it to mostly recover model quality.
Really irresponsible of @UPS, which has lost track of my PASSPORT and isn't even investigating the issue. Clearly an opaque system and customer apathy. No one in customer support knows how to help me locate the package, which holds objectively the most IMPORTANT document.
.@UPS a ground transport package carrying my wife's passport from within SF to our house in SF has been missing for a week now. This is ridiculous. It is her passport & we need it asap. Customer care has not been helpful in any way. Very disappointed, need help asap @_keysarasara
Seeing performance issues with Gemma 3n's vision capabilities? We're investigating the cause. This is not a model quality issue, but a bug in the MobileNet V5 vision encoder implementation in timm (and hence transformers) and MLX. We apologize for any confusion this has caused.…
Want to learn about the research behind Gemma 3n? AltUp: arxiv.org/abs/2301.13310; LAuReL: arxiv.org/abs/2411.07501; MatFormer: arxiv.org/abs/2310.07707; Activation sparsity: arxiv.org/abs/2506.06644; Universal Speech Model: arxiv.org/abs/2303.01037; Blog: developers.googleblog.com/en/introducing…
We are hiring a Technical Program Manager to organize and enable our research teams to be the best at what they do and to make fast-paced progress towards our mission of bringing AGI responsibly. Ideal candidates should have a demonstrable record of strong program management…