Negin Raoof
@NeginRaoof_
Ph.D. student @Berkeley_EECS advised by @AlexGDimakis. Ex: SWE @microsoft, collaborator @PyTorch
OpenThinker3-7B now outperforms Nvidia’s Nemotron-Nano-8B and DeepSeek-R1-Distill-Qwen-7B on reasoning benchmarks. It’s the strongest open-data reasoning model at the 7B scale 🧠 Today we’re releasing the full data curation recipe, dataset and model along with our paper. Huge…
Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average across code, science, and math evals. We also release our dataset, OpenThoughts3-1.2M, which is the best open reasoning dataset across all data…
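For anyone who wants to poke at the release: a minimal sketch for pulling the dataset from the Hugging Face Hub. The repo id open-thoughts/OpenThoughts3-1.2M is an assumption, inferred from the dataset name and the open-thoughts links shared below.

```python
# Minimal sketch: pull the released dataset from the Hugging Face Hub.
# The repo id below is an assumption, not confirmed by this thread.
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts3-1.2M", split="train")
print(len(ds))   # ~1.2M question/reasoning-trace pairs
print(ds[0])     # inspect one training example
```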
Today, I’m launching a deeply personal project. I’m betting $100M that we can help computer scientists create more upside impact for humanity. Built for and by researchers, including @JeffDean & @jpineau1 on the board, @LaudeInstitute catalyzes research with real-world impact.
📢📢📢 Releasing OpenThinker3-1.5B, the top-performing SFT-only model at the 1B scale! 🚀 OpenThinker3-1.5B is a smaller version of our previous 7B model, trained on the same OpenThoughts3-1.2M dataset.
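A minimal inference sketch for the 1.5B model, assuming the checkpoint lives at open-thoughts/OpenThinker3-1.5B and ships a chat template (both are assumptions, not confirmed in the thread):

```python
# Minimal inference sketch; the repo id is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker3-1.5B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is the sum of the first 100 odd numbers?"}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated reasoning trace, not the prompt.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```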
Evaluating agents on benchmarks is a pain. Each benchmark comes with its own harness, scoring scripts, and environments, and integrating one can take days. We're introducing the Terminal-Bench dataset registry to solve this problem. Think of it as the npm of agent benchmarks. Now…
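The "npm of agent benchmarks" idea boils down to resolving a name-plus-version spec to a ready-to-run benchmark package. A minimal sketch of that resolution step, with made-up names (Benchmark, REGISTRY, resolve) that are not the actual Terminal-Bench API:

```python
# Hypothetical sketch of a benchmark registry, npm-style: a name==version
# spec resolves to a runnable benchmark package. Not the real Terminal-Bench API.
from dataclasses import dataclass

@dataclass
class Benchmark:
    name: str
    version: str
    tasks: list  # task ids bundled with this benchmark release

REGISTRY = {
    ("swe-tasks", "1.0.0"): Benchmark("swe-tasks", "1.0.0", ["fix-build", "git-bisect"]),
}

def resolve(spec: str) -> Benchmark:
    """Resolve 'name==version' to a benchmark, like a package manager would."""
    name, _, version = spec.partition("==")
    try:
        return REGISTRY[(name, version)]
    except KeyError:
        raise KeyError(f"{spec} not found in registry") from None

bench = resolve("swe-tasks==1.0.0")
print(bench.tasks)
```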
We evaluated more than 1,000 reasoning LLMs on 12 reasoning-focused benchmarks and made fascinating observations about cross-benchmark comparisons. You can explore all of that data yourself on our Hugging Face Spaces page. (1/4)
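One way to explore cross-benchmark comparisons like these: treat each benchmark as a column of per-model scores and check how consistently the benchmarks rank the same models. A toy sketch with made-up numbers and model names (the benchmark columns are illustrative):

```python
# Toy sketch (made-up scores): with per-model results on several benchmarks,
# cross-benchmark agreement is just a rank-correlation matrix.
import pandas as pd

scores = pd.DataFrame(
    {
        "AIME24":        [0.30, 0.55, 0.62],
        "GPQA":          [0.35, 0.48, 0.57],
        "LiveCodeBench": [0.22, 0.41, 0.50],
    },
    index=["model-a", "model-b", "model-c"],
)
# Spearman correlation: do these benchmarks rank the models the same way?
print(scores.corr(method="spearman"))
```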
OpenThoughts3 is the #1 trending dataset on Hugging Face! Thank you to everyone who is using the dataset and giving us great feedback 🚀!
Open weights, Open data, Open code -- SOTA reasoning model with only 7B parameters. Excited to see LlamaFactory powering its training 🥳
Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.
Amazing work by the drivers of this project 🥳 @etash_guha @ryanmart3n @sedrickkeh2 @NeginRaoof_
Paper: arxiv.org/abs/2506.04178 Model: huggingface.co/open-thoughts/… Dataset: huggingface.co/datasets/open-… Code: github.com/open-thoughts/… Blog: openthoughts.ai/blog/ot3 (10/N)
OpenThoughts3-1.2M and OpenThinker3-7B are a major milestone for open-data reasoning models, improving over DeepSeek-R1-Distill-Qwen-7B as well as Nemotron-Nano-8B! Excited to be part of this work, and a big thank you to everyone on the team!
I love how counterintuitive rigorous empirical research can be. We found that the best models (R1) aren't necessarily the best teachers (QwQ taught better), and that scaling the number of answers sampled per question is as efficient as scaling the number of questions. Great work team!
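The "answers per question" axis is cheap to implement: instead of mining new questions, sample k reasoning traces per prompt from the teacher. A minimal sketch, assuming a Hugging Face teacher checkpoint; the Qwen/QwQ-32B id and k=4 here are illustrative choices, not the paper's exact pipeline:

```python
# Sketch of the "multiple answers per question" data-scaling axis:
# sample k traces per prompt from a teacher rather than adding new prompts.
# Teacher id and k are illustrative assumptions.
from transformers import pipeline

teacher = pipeline("text-generation", model="Qwen/QwQ-32B")
question = "Prove that the sum of two odd integers is even."
traces = teacher(
    question,
    num_return_sequences=4,   # k answers for the same question
    do_sample=True,           # sampling gives diverse traces
    temperature=0.7,
    max_new_tokens=1024,
)
dataset_rows = [{"question": question, "answer": t["generated_text"]} for t in traces]
```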
Congrats. One of the pioneering efforts on open reasoning models right now. Had no idea this was such a big team! For smaller models, distilling from R1 is the easiest path to performance. I'm more interested in the RL side (the work is more fun), but this is very impactful.
I'm excited to finally announce what we have been working on for months: OpenThinker3, the strongest 7B reasoning model with open data, plus more than 1,000 experiments on what works and what doesn't for post-training data curation.
we created a SOTA reasoning dataset. a massive effort and an extremely fun time working with an awesome team :)
Many agents (Claude Code, Codex CLI) interact with the terminal to do valuable tasks, but do they currently work well enough to deploy en masse? We’re excited to introduce Terminal-Bench: An evaluation environment and benchmark for AI agents on real-world terminal tasks. Tl;dr…
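Conceptually, a terminal-task eval is: drop the agent into a sandbox with an instruction, let it run shell commands, then run a task-specific check on the resulting state. A heavily simplified, hypothetical loop; the agent callable and the check command are stand-ins, not the real Terminal-Bench harness:

```python
# Hypothetical, heavily simplified terminal-task evaluation loop.
# `agent` and the per-task check are stand-ins, not the real harness.
import subprocess
import tempfile

TASKS = [
    {
        "instruction": "Create a git repo named 'repo' with exactly one commit.",
        "check": "git -C repo log --oneline | wc -l",  # success => prints "1"
    },
]

def evaluate(agent, tasks):
    passed = 0
    for task in tasks:
        with tempfile.TemporaryDirectory() as sandbox:
            agent(task["instruction"], cwd=sandbox)  # agent issues shell commands
            result = subprocess.run(
                task["check"], shell=True, cwd=sandbox,
                capture_output=True, text=True,
            )
            passed += result.stdout.strip() == "1"   # task-specific success check
    return passed / len(tasks)
```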
Happy #WomeninMathematics day!
Happy #WomeninMathematics day! May 12 marks the birthday of Maryam Mirzakhani, a mathematician awarded the Fields Medal (the highest honor in mathematics) for her contributions to geometry and dynamical systems. Two of my favorite mathematicians: Maryam Mirzakhani & Ingrid Daubechies
Pretty pleased to see OpenThinker as a baseline in a frontier lab report like Phi-4-reasoning! Open-data reasoning models are going strong. Thanks @MSFTResearch!
Had so much fun, thanks to @berkeley_csge!
1/ Berkeley AI is just next level
OpenAI co-founder @johnschulman2
Perplexity co-founder @denisyarats
Databricks co-founder @andykonwinski
Bespoke Labs co-founder @AlexGDimakis
All on one panel w/ audience of brilliant Berkeley AI grad student researchers
My kinda Fri…
The Berkeley entrepreneurs student club has some cool alums that started a few small startups.