Aditya Kusupati

@adityakusupati

Been places..... Done things.... Adaptive Compute for Gemini and beyond @GoogleDeepMind

San Francisco, CA

Joined March 2012

2K Following

5K Followers

Aditya Kusupati Retweeted

Karan Vaidya@KaranVaidya6 · Jul 22

Agents aren’t reliable. They don’t learn from experience. At @composiohq, we provide skills that evolve with your agents. @lightspeedvp gave us $25M to make agents usable.

230

148

1.0K

448

536.0K

Aditya Kusupati Retweeted

Dumitru Erhan@doomie · Jul 22

Want to be part of a team redefining SOTA for generative video models? Excited about building models that can reach billions of users? The Veo team is hiring! We are looking for amazing researchers and engineers, in North America and Europe. Details below:

209

58.0K

Aditya Kusupati Retweeted

Archit Sharma@archit_sharma97 · Jul 21

I cannot emphasize this enough: the system uses no tools, no Lean: text in, text out. And the more we scale inference compute, the more accurate the proofs get, while still reading like natural text.

2.0K

Aditya Kusupati@adityakusupati · Jul 21

Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level in the International Mathematical Olympiad! 🏆, solving five out of six problems perfectly, as verified by the IMO organizers! It’s been a wild run to lead this…

Thang Luong@lmthang · Jul 25, 2024

Super thrilled to share that our AI has now reached silver medalist level in Math at #imo2024 (1 point away from 🥇)! Since Jan, we now not only have a much stronger version of #AlphaGeometry, but also an entirely new system called #AlphaProof, capable of solving many more…

226

2.0K

227

384.0K

Aditya Kusupati Retweeted

Shashank@shawshank_v · Jul 21

Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, DINOv2 on various benchmarks setting a new standard for open-source research🧵

260

193

40.0K

Aditya Kusupati@adityakusupati · Jul 15

Check out our #ICML2025 poster on Matryoshka Quantization tomorrow, being presented by @puranjay1412 on behalf of all the authors.

PURANJAY DATTA@puranjay1412 · Jul 15

Hi, I'll be presenting Matryoshka Quantization (arxiv.org/abs/2502.06786) on 16th July at #ICML2025 📍East Exhibition Hall A-B #3606 ⏲️ 11 AM - 1:30 PM

167

29.0K

Aditya Kusupati@adityakusupati · Jul 15

FYI, as @jainprateek_ points out, @puranjay1412 is applying to grad schools. x.com/jainprateek_/s…

Prateek Jain@jainprateek_ · Jul 15

Puranjay will present our poster on nested bitwise models, or MatQuant, so if you are at ICML and interested in the topic, do bother him :) Puranjay is going on the grad-school market this cycle. So if you are looking for a brilliant, hardworking student with good ML+LLM exposure,…

17.0K

Aditya Kusupati@adityakusupati · Jul 15

🪆 Matryoshka is extremely general & applicable to every component in our modern ML/DL stack. It can't get more fundamental than 🪆 in bit space to enable elastic quantization! Drop by the poster and say hi to Puranjay (on behalf of @pranavn1008 @JeffDean @jainprateek_ & me).

PURANJAY DATTA@puranjay1412 · Jul 15

Hi, I'll be presenting Matryoshka Quantization (arxiv.org/abs/2502.06786) on 16th July at #ICML2025 📍East Exhibition Hall A-B #3606 ⏲️ 11 AM - 1:30 PM

5.0K

Aditya Kusupati@adityakusupati · Jul 15

PURANJAY DATTA@puranjay1412 · Jul 15

Hi, I'll be presenting Matryoshka Quantization (arxiv.org/abs/2502.06786) on 16th July at #ICML2025 📍East Exhibition Hall A-B #3606 ⏲️ 11 AM - 1:30 PM

20.0K

Aditya Kusupati Retweeted

Shivam Duggal@ShivamDuggal4 · Jul 11

Compression is the heart of intelligence. From Occam to Kolmogorov: shorter programs = smarter representations. Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, token budget T & target quality 𝜖, KARL finds the smallest t≤T to reconstruct it within 𝜖🧵

344

268

51.0K

Aditya Kusupati Retweeted

Omar Sanseviero@osanseviero · Jul 8

Introducing MatFormer Lab for Gemma 3n 🧑‍🔬 Use Mix-n-Match to slice the E4B and create a model with a custom size between 2B and 4B effective parameters. Explore the quality-size trade-off and share your models with the community. Try it out: goo.gle/gemma3n-matfor…

203

16.0K

Aditya Kusupati Retweeted

Google AI Developers@googleaidevs · Jul 7

Gemma 3n is powered by the MatFormer (Matryoshka Transformer) architecture, a model containing smaller, fully-functional submodels. Learn more 🧵↓

674

242

43.0K

Aditya Kusupati@adityakusupati · Jul 4

We're looking for people to join us to work on Gemini Diffusion and help revolutionize language modeling! Details below: job-boards.greenhouse.io/deepmind/jobs/…

Brendan O'Donoghue@bodonoghue85 · May 20

Excited to share what my team has been working on lately - Gemini diffusion! We bring diffusion to language modeling, yielding more power and blazing speeds! 🚀🚀🚀 Gemini diffusion is especially strong at coding. In this example the model generates at 2000 tokens/sec,…

268

121

38.0K

Aditya Kusupati Retweeted

Danfeng Qin@rise24546323 · Jul 3

We have now traced the main cause of poor quality in Gemma 3n MobileNet V5 to conv-layer H and W dimensions being incorrectly swapped in the JAX-to-PyTorch checkpoint conversion, along with other minor issues. We are pushing out the new checkpoint and expect it to mostly recover model quality.

4.0K

Aditya Kusupati Retweeted

Sanjana Sharma@_keysarasara · Jul 3

Really irresponsible of @UPS, who has lost the tracking of my PASSPORT and isn't even investigating this issue. Clearly an opaque system and customer apathy. None of the customer support staff know how to help me locate the package, which has objectively the most IMPORTANT document

2.0K

Aditya Kusupati@adityakusupati · Jul 3

.@UPS a ground transport package carrying my wife's passport from within SF to our house in SF has been missing for a week now. This is ridiculous. It is her passport & we need it asap. Customer care has not been helpful in any way. Very disappointed - need help asap @_keysarasara

3.0K

Aditya Kusupati Retweeted

Omar Sanseviero@osanseviero · Jul 3

Seeing performance issues with Gemma 3n's vision capabilities? We're investigating the cause. This is not a model quality issue, but a bug in the MobileNet V5 vision encoder implementation in timm (and hence transformers) and MLX. We apologize for any confusion this has caused.…

276

34.0K

Aditya Kusupati Retweeted

Andrew Drozdov@mrdrozdov · Jul 3

2.0K

Aditya Kusupati Retweeted

Omar Sanseviero@osanseviero · Jul 1

Want to learn about the research behind Gemma 3n?
Altup - arxiv.org/abs/2301.13310
LAuReL - arxiv.org/abs/2411.07501
MatFormer - arxiv.org/abs/2310.07707
Activation sparsity - arxiv.org/abs/2506.06644
Universal Speech Model - arxiv.org/abs/2303.01037
Blog - developers.googleblog.com/en/introducing…

115

606

717

47.0K

Aditya Kusupati Retweeted

Prateek Jain@jainprateek_ · Jun 30

We are hiring a Technical Program Manager to organize and enable our research teams to be the best at what they do and to make fast-paced progress towards our mission of bringing AGI responsibly. Ideal candidates should have a demonstrable record of strong program management…

15.0K