Mihir Patel
@mvpatel2000
Member of Technical Staff @AnthropicAI. prev @MosaicML/@DbrxMosaicAI, @Stanford. Views do not represent my employer.
The Lion Optimizer doesn’t concern himself with the opinions of the second moment
when you're the eval guy and you check up on the experiments channel in slack
Mfs will say "I need to lock-in" and then just paste their error trace into claude code before going back to doomscrolling twitter
Being an early employee is already a brutal risk-reward tradeoff, and it's terrible to see folks screwed over like this. Prem is a champ -- big respect to him. At this point, I wouldn't join any early-stage startup unless I knew the founders well
I’ve joined Cognition to continue to work on the future of software engineering. I was employee #2 at Windsurf and have worked on AI+code for years. There’s never been a more exciting time and place for it than now at Cognition. I had a place at Google DeepMind as part of the…
Mandate of heaven
BREAKING: Claude Code PMs Boris Cherny and Cat Wu have returned to Anthropic after a brief stint at Cursor.
Just opened a PR yesterday that will reduce the binary size of PyTorch by 40% by adding 1 flag to NVCC. With ~50M monthly downloads of PyTorch, this one change will reduce global internet traffic by ~20PB. High-impact changes like this are why I love OSS. github.com/pytorch/pytorc…
Claude Vending Machine was super fun! Claude briefly appointed me CEO, and while I am no longer involved in day-to-day activities, I continue to enjoy special discounts as CEO emeritus.
New Anthropic Research: Project Vend. We had Claude run a small shop in our office lunchroom. Here’s how it went.
Insane that Meta could have gotten Noam Shazeer and an elite LLM team for $4b and now is paying $50m a researcher, $1b an executive, $10b for Alex
You know you're a flophead when your friends only text you asking how to improve their MFU

Power laws everywhere for those with eyes to see
Our research is driving the #AutonomousVehicle industry forward. 🚘➡️ We’ve found that #AI scaling laws hold true in AV software and real-world performance can be improved by increasing training data and compute. Take a deeper dive: waymo.com/blog/2025/06/s…
meta saw everyone else was betting on scale and got confused
NEWS: Meta building a "Superintelligence" team to revamp its AI division, investing $10+ billion in Scale AI. Founder Alexandr Wang will join Meta to help run the new division. Meta throwing around billions to recruit top AI talent. w/ @CadeMetz nytimes.com/2025/06/10/tec…
deep learning compilers when you give them a model with dynamic shapes and control flow
People overestimate moats and business strategies and underestimate the power of a great team with amazing execution. Congrats on the incredible ship speed!! Super excited to try this out
Cursor 1.0 is out now! Cursor can now review your code, remember its mistakes, and work on dozens of tasks in the background.
Scientific integrity like this is fantastic and should be greatly rewarded
sorry for the late update. I bring disappointing news. softpick does NOT scale to larger models. overall training loss and benchmark results are worse than softmax on our 1.8B parameter models. we have updated the preprint on arxiv: arxiv.org/abs/2504.20966
I'm pro starting companies but if another one of my friends starts something I'm gonna lose it. The amount of cold dms to ask for intros is nuts
Went to a 12 person dinner, heard a dude rant about FSDP and sharded checkpointing the whole time. SF is a magical place