rohan anil
@_arohan_
all about training algorithms & efficiency. @AnthropicAI Ex: Meta (2025), Google DeepMind, Google Brain, Sibyl (2013-2024). Views are my own.
A Saturday reminder to all new followers that Shampoo stands for a preconditioner. It’s called Shampoo because that’s what comes pre/before using a conditioner.
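For anyone curious what "preconditioner" means here, below is a minimal sketch of a Shampoo-style update for a single weight matrix, in the spirit of Gupta et al. (2018). It is my own illustration, not the production optimizer; names and hyperparameters are assumptions.

```python
# Sketch of one Shampoo-style preconditioned step for a 2-D weight matrix.
import numpy as np

def shampoo_step(W, G, L, R, lr=1e-2, eps=1e-6):
    """Accumulate Kronecker-factor statistics, then precondition the gradient G."""
    L += G @ G.T          # left statistics  (out_dim x out_dim)
    R += G.T @ G          # right statistics (in_dim  x in_dim)

    def inv_fourth_root(M):
        # Matrix inverse fourth root via eigendecomposition; real implementations
        # use iterative root solvers and refresh this only occasionally.
        vals, vecs = np.linalg.eigh(M + eps * np.eye(M.shape[0]))
        return vecs @ np.diag(vals ** -0.25) @ vecs.T

    precond_G = inv_fourth_root(L) @ G @ inv_fourth_root(R)
    return W - lr * precond_G, L, R
```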
With the use of computers, docs, and now LLMs, I think I am losing my ability to think with pen and paper, or at least it feels foreign to me. I wonder if I am losing access to some circuits in my brain to think better.
I’ve joined Cognition to continue to work on the future of software engineering. I was employee #2 at Windsurf and have worked on AI+code for years. There’s never been a more exciting time and place for it than now at Cognition. I had a place at Google DeepMind as part of the…
Dang! This is very good!
Compared to the first version in our paper, this code removes problem-specific hints completely. It just works!
Code release! 🚀 Following up on our IMO 2025 results with the public LLM Gemini 2.5 Pro — here’s the full pipeline & general (non-problem-specific) prompts. 👉 [github.com/lyang36/IMO25] Have fun exploring! #AI #Math #LLMs #IMO2025
🚨 Olympiad math + AI: We ran Google’s Gemini 2.5 Pro on the fresh IMO 2025 problems. With careful prompting and pipeline design, it solved 5 out of 6 — remarkable for tasks demanding deep insight and creativity. The model could win gold! 🥇 #AI #Math #LLMs #IMO2025
It's becoming more and more clear that Claude Code is the everything agent
HLE has recently become the benchmark to beat for frontier agents. We @FutureHouseSF took a closer look at the chem and bio questions and found about 30% of them are likely invalid based on our analysis and third-party PhD evaluations. 1/7
Really enjoyed reading this work! One way I tried to explain subliminal learning is by drawing a parallel to watermarking text, which generally works by biasing generation at each step toward one partition of the token vocabulary (the partitioning happens at every step using a private…
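A hedged sketch of the watermarking scheme the analogy refers to ("green list" watermarking in the style of Kirchenbauer et al., 2023): a private key plus the previous token seeds a PRNG that partitions the vocabulary at each step, and the logits of one partition get a small boost. The function and parameter names here (bias_logits, key, delta) are illustrative, not a specific library API.

```python
import numpy as np

def bias_logits(logits, prev_token, key, green_fraction=0.5, delta=2.0):
    """Return logits with this step's 'green' partition boosted by delta."""
    vocab_size = logits.shape[0]
    # Re-derivable partition: seed a PRNG with (private key, previous token),
    # both integers, so a detector holding the key can recompute it later.
    rng = np.random.default_rng([key, prev_token])
    green = rng.permutation(vocab_size)[: int(green_fraction * vocab_size)]
    biased = logits.copy()
    biased[green] += delta  # nudge sampling toward the green set
    return biased
```

A detector with the key recomputes each step's partition and counts how often the sampled tokens landed in their green set; far above chance means watermarked. The parallel to subliminal learning: the bias is invisible in any single token but statistically recoverable across many.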
Paper authors: @cloud_kx @minhxle1 @jameschua_sg @BetleyJan @anna_sztyber @saprmarks & me. Arxiv pdf: arxiv.org/abs/2507.14805 Blogpost: alignment.anthropic.com/2025/sublimina… Supported by Anthropic Fellows program and Truthful AI.
Subliminal learning may be a general property of neural net learning. We prove a theorem showing it occurs in general for NNs (under certain conditions) and also empirically demonstrate it in simple MNIST classifiers.
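Roughly the flavor of that MNIST demonstration, as I read it: a student trained only to match a teacher's logits on inputs unrelated to the task can still pick up some of the teacher's behavior. The architecture, noise inputs, and hyperparameters below are my assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_mlp():
    return nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

def distill_on_noise(teacher, steps=1000, batch=128, lr=1e-3):
    """Train a fresh student to imitate the teacher on pure-noise images only."""
    student = make_mlp()
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(steps):
        x = torch.rand(batch, 1, 28, 28)      # auxiliary inputs: no real digits at all
        with torch.no_grad():
            target = teacher(x)
        loss = F.kl_div(F.log_softmax(student(x), dim=-1),
                        F.softmax(target, dim=-1), reduction="batchmean")
        opt.zero_grad(); loss.backward(); opt.step()
    return student  # evaluating this student on real MNIST is the interesting part
```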
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
In a joint paper with @OwainEvans_UK as part of the Anthropic Fellows Program, we study a surprising phenomenon: subliminal learning. Language models can transmit their traits to other models, even in what appears to be meaningless data. x.com/OwainEvans_UK/…
Everyone get your top 1% quality dataset and train 100 epochs right now