Dan Zhang
@DZhang50
Gemini Model+HW Codesign @ Google DeepMind | Computer Architecture PhD @ UT Austin🤘 | Opinions stated here are my own.
Currently, datacenter ML training and inference use commodity TPU and GPU devices optimized for a wide range of workloads. Given the extreme scale of large datacenter deployments, would it be practical to build custom accelerators optimized for specific workloads? (1/4)
Very excited to announce that I’ll be co-organizing a @NeurIPSConf workshop on LLM evals! Identifying shortcomings in model capabilities in a robust, scientific way is a critical part of model development. Looking forward to discussing ideas and hearing from some eval experts!
We are happy to announce our @NeurIPSConf workshop on LLM evaluations! Mastering LLM evaluation is no longer optional -- it's fundamental to building reliable models. We'll tackle the field's most pressing evaluation challenges. For details: sites.google.com/corp/view/llm-…. 1/3
Excited to share that a scaled up version of Gemini DeepThink achieves gold-medal standard at the International Mathematical Olympiad. This result is official, and certified by the IMO organizers. Watch this space, more to come soon! deepmind.google/discover/blog/…
Our IMO gold model is not just an "experimental reasoning" model. It is way more general purpose than anyone would have expected. This general deep think model is going to be shipped so stay tuned! 🔥
So happy to see this incredible achievement. Huge congrats to @lmthang, @quocleix, @YiTayML and the IMO team on the result. This was a great collaboration across teams to build a general Gemini DeepThink model that can also get gold at IMO.
Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level in the International Mathematical Olympiad 🏆, solving five out of six problems perfectly, as verified by the IMO organizers! It’s been a wild run to lead this…
Super thrilled to share that our AI has now reached silver medalist level in Math at #imo2024 (1 point away from 🥇)! Since Jan, we now not only have a much stronger version of #AlphaGeometry, but also an entirely new system called #AlphaProof, capable of solving many more…
Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence. Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
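The dynamic-chunking idea is easiest to see in code. Below is a minimal sketch, assuming a learned per-position boundary scorer that pools byte-level states into variable-length chunks; the module and its hard 0.5 threshold are my own illustration, not the actual H-Net implementation (which would need a differentiable chunking mechanism to train end-to-end).

```python
# Minimal sketch of dynamic chunking -- illustrative only, not H-Net.
# A learned scorer marks chunk boundaries, and each discovered span of
# byte-level states is mean-pooled into one higher-level chunk vector.
import torch
import torch.nn as nn

class DynamicChunker(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        # Scores each position: how likely is a chunk boundary here?
        self.boundary_scorer = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        # x: (seq_len, d_model) byte-level hidden states (batch omitted)
        probs = torch.sigmoid(self.boundary_scorer(x)).squeeze(-1)
        boundaries = probs > 0.5  # hard threshold just for illustration
        chunks, start = [], 0
        for i, is_boundary in enumerate(boundaries.tolist()):
            if is_boundary:
                chunks.append(x[start : i + 1].mean(dim=0))
                start = i + 1
        if start < x.size(0):  # trailing span with no boundary marker
            chunks.append(x[start:].mean(dim=0))
        return chunks  # variable-length list of chunk embeddings
```

Stacking this hierarchically -- chunks of chunks -- is what makes the architecture tokenizer-free rather than just a learned tokenizer bolted onto the front.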
I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit) In a few days, we'll release what I believe is the next major advance for architectures.
This is one of the most interesting papers I've read in a long time, not only in terms of token efficiency but also in terms of potentially interesting latent interactions with the higher-order trilinear representations. arxiv.org/abs/2507.02754
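For readers skimming: "trilinear" here means attention scores become third-order interactions rather than the usual bilinear q·k dot products. A toy sketch of the shape difference, which is my own illustration and not the paper's exact formulation:

```python
# Toy contrast between bilinear and trilinear attention logits.
# Not the paper's formulation; just shows the higher-order interaction.
import torch

seq, d = 8, 16
q  = torch.randn(seq, d)   # queries
k1 = torch.randn(seq, d)   # first key set
k2 = torch.randn(seq, d)   # second key set

# Standard attention: each query scores single keys -> (seq, seq)
bilinear = torch.einsum("id,jd->ij", q, k1)

# Trilinear: each query scores *pairs* of keys -> (seq, seq, seq),
# hence the tradeoff between token efficiency and the O(n^3) logit cost.
trilinear = torch.einsum("id,jd,kd->ijk", q, k1, k2)

print(bilinear.shape, trilinear.shape)  # torch.Size([8, 8]) torch.Size([8, 8, 8])
```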
xai has free coffee on weekends; we have free tongsui on weekdays

do any effective altruists still follow "earn to give"? 80000hours.org/articles/earni…
Sam Altman says Meta is offering $100M signing bonuses to OpenAI staff. Not $100M annual compensation, just the signing bonus! He clowned Meta: “that’s not how you build a great culture.” Also said none of OpenAI’s best people are leaving. This AI talent war is crazy.
getting tired of winning 😎
new gemini 2.5 is out, and the cost is so good vs o3 and opus. anybody else getting model fatigue? 😂