Adi Renduchintala
@rendu_a
Applied Research Scientist @NVIDIA; former Research Scientist @MetaAI; PhD @jhuclsp. Also lurking on Mastodon: [email protected]
AI model post-training is rapidly improving. The plot below (starting from the same base model) illustrates about 10 months of progress in *open* post-training research. I'm not convinced that closed research can move as fast.
Transformers still dominate the LLM scene, but we show that higher-throughput alternatives exist that are just as strong! Grateful to have played a part in the Nemotron-H Reasoning effort. 🙏 Technical report will be out soon, stay tuned!
👀 Nemotron-H tackles large-scale reasoning while maintaining speed, with 4x the throughput of comparable transformer models. ⚡ See how #NVIDIAResearch accomplished this using a hybrid Mamba-Transformer architecture and model fine-tuning ➡️ nvda.ws/43PMrJm
Some people have said that OpenAI achieved state-of-the-art results on the SWE-Bench Verified leaderboard with their Codex model, but that's not quite correct, no matter how you measure it. A quick 🧵
NSF budgets slashed by 50%, ongoing grants cancelled, NSF staff drastically reduced, all 37 divisions abolished, and grants will now be reviewed by a political kommissar. How will that help technological leadership? linkedin.com/posts/yann-lec…
idk dude, I come here and look at my feed: literally everyone on my following feed is subscribed to DOGE, and not a single professional scientific researcher has noted that every division at the NSF was just abolished
Markets down 📉 NVIDIA LLM research 📈 Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models! Better or on-par accuracy compared to similar open-source transformer models while being up to ✨ 3× faster at inference ✨ arxiv.org/abs/2504.03624
We are excited to release Llama-Nemotron-Ultra! This is a reasoning ON/OFF, dense 253B model, with open weights and post-training data. huggingface.co/nvidia/Llama-3… We started with Llama-405B, modified it via NAS pruning, then followed with reasoning-focused post-training: SFT + RL in FP8.
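For context on the ON/OFF toggle: as I recall from the model card, it is controlled through the system prompt rather than a separate API flag. A minimal sketch under that assumption; the repo id below is a placeholder (the real name is in the truncated link above) and the exact toggle string should be checked against the card.

```python
# Hedged sketch: toggling Llama-Nemotron reasoning mode from the prompt side.
# Assumptions: the Hugging Face repo id below is a PLACEHOLDER (the real one
# is in the truncated link above), and "detailed thinking on/off" is my
# recollection of the system-prompt toggle; verify both on the model card.
from transformers import AutoTokenizer

MODEL_ID = "nvidia/llama-nemotron-placeholder"  # hypothetical repo id

def build_prompt(tokenizer, question: str, reasoning: bool) -> str:
    messages = [
        {"role": "system",
         "content": "detailed thinking on" if reasoning else "detailed thinking off"},
        {"role": "user", "content": question},
    ]
    # Render the conversation with the model's own chat template; generation
    # itself (vLLM, TRT-LLM, transformers) is unchanged between the two modes.
    return tokenizer.apply_chat_template(messages, tokenize=False,
                                         add_generation_prompt=True)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
print(build_prompt(tokenizer, "What is 17 * 24?", reasoning=True))
```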
Nemotron-H: A family of Hybrid Mamba-Transformer LLMs.
* Hybrid architecture means up to 3X faster at the same accuracy (back-of-envelope intuition below)
* Trained in FP8
* Great for VLMs
* Weights and instruct versions to come soon.
research.nvidia.com/labs/adlr/nemo…
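On the 3X claim: a big part of the inference win comes from what each layer has to keep around per sequence at decode time. A back-of-envelope sketch with made-up numbers (not Nemotron-H's actual layer counts, head sizes, or context lengths), just to show the shape of the argument.

```python
# Back-of-envelope sketch (hypothetical numbers, not Nemotron-H's config):
# attention layers must keep a KV cache that grows linearly with context,
# while an SSM/Mamba layer carries a fixed-size state per sequence.

BYTES = 2                      # fp16/bf16 elements
SEQ_LEN = 65_536               # hypothetical long context
NUM_LAYERS = 64                # hypothetical depth
KV_HEADS, HEAD_DIM = 8, 128    # hypothetical GQA config

def kv_cache_gib(num_attention_layers: int) -> float:
    # 2x for keys and values
    b = 2 * num_attention_layers * KV_HEADS * HEAD_DIM * SEQ_LEN * BYTES
    return b / 2**30

pure_transformer = kv_cache_gib(NUM_LAYERS)   # every layer is attention
hybrid = kv_cache_gib(8)                      # only a handful of attention layers
print(f"pure transformer KV cache: {pure_transformer:.1f} GiB per sequence")
print(f"hybrid (8 attn layers):    {hybrid:.1f} GiB per sequence")
# A smaller per-sequence cache means larger batches fit in memory,
# which is where much of the decode-throughput gain comes from.
```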
This paper provides some really interesting insights:
1. Previously, people found that Qwen base models are particularly good for R1-style training, showing strong exploration skills.
- This paper shows that there is no magic about Qwen base models; they were likely pre-trained with…
🪂 Understanding R1-Zero-Like Training: A Critical Perspective
* DeepSeek-V3-Base already exhibits an “Aha moment” before RL-tuning??
* The ever-increasing output length in RL-tuning might be due to a BIAS in GRPO??
* Getting GRPO Done Right, we achieve a 7B AIME SOTA! 🧵
📜 Full…
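On the GRPO length-bias point: as I read it, the claim is that averaging the loss per response (dividing by its own length) makes long wrong answers cheaper per token than short wrong ones, so length drifts upward during RL. A toy sketch of that effect, with illustrative numbers and a simplified per-token weight rather than the paper's actual objective.

```python
# Hedged sketch of the length bias attributed to GRPO (illustrative numbers,
# not the paper's code): with per-response length normalization, a long wrong
# answer receives a much smaller per-token penalty than a short wrong one.
import numpy as np

def grpo_advantages(rewards, eps=1e-6):
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)   # group-relative, std-normalized

def per_token_weight(adv, length, length_normalize=True):
    # weight applied to each token's log-prob gradient in that response
    return adv / length if length_normalize else adv

rewards = [1.0, 0.0, 0.0]     # one correct, two incorrect rollouts
lengths = [200, 200, 2000]    # the second wrong answer is 10x longer
advs = grpo_advantages(rewards)

for a, L in zip(advs, lengths):
    print(f"len={L:4d}  length-normalized weight={per_token_weight(a, L):+.5f}  "
          f"unnormalized={per_token_weight(a, L, length_normalize=False):+.5f}")
# With 1/length normalization, the long wrong answer is penalized ~10x less per
# token, so response length tends to creep up; dropping that term removes the bias.
```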
We are excited to release new Llama-Nemotron models. These models allow you to toggle reasoning ON/OFF at runtime. We also release all the post-training data under CC-BY-4.0! Try it now on build.nvidia.com/nvidia/llama-3… HF collection: huggingface.co/collections/nv…
DPO, SimPO, RPO, ... there are just too many **PO**s in the NLP/LLM world! 💥💥💥😲 If you wonder which PO truly works best, how to make them even better, and how they interconnect, read our latest paper at arxiv.org/pdf/2502.00203 👇 (1/3)
Our team put together a unified mathematical framework to analyze popular model alignment algorithms. “Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment” arxiv.org/pdf/2502.00203
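I won't try to reconstruct the RPO framework itself from a tweet, but for readers wondering what all these *POs share: most of them riff on the DPO objective. A minimal PyTorch sketch of that baseline loss, with toy log-probs and an arbitrarily chosen beta.

```python
# Minimal sketch of the DPO loss that most "*PO" variants build on (shared
# context only, not the RPO framework from the paper). Inputs are summed
# log-probs of the chosen/rejected responses under the policy and a frozen
# reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit rewards: beta * log-ratio against the reference model
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Bradley-Terry preference likelihood: chosen should out-reward rejected
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy example with fake log-probs for a batch of 2 preference pairs
loss = dpo_loss(torch.tensor([-12.0, -9.0]), torch.tensor([-15.0, -8.5]),
                torch.tensor([-13.0, -9.5]), torch.tensor([-14.0, -9.0]))
print(loss.item())
```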
So collusion rings are still a thing at top conferences. @openreviewnet has all the data across all top-tier ML conferences: what are they currently doing to solve this? Which top-tier conference is willing to release data on bidding patterns so we can analyze the problem at scale?
🤔 Pre-training as Ilya knows it will end, but not for us. At NeurIPS, @ilyasut shared an insightful perspective: "pre-training as we know it will end". I fully agree that agents, synthetic data, and inference-time compute are critical breakthroughs for superintelligence,…
🥳 Once again I am looking for PhD students in 💬persuasion, 💬dialogues, and ⛑️AI safety to join our CHATS lab @Northeastern in Fall 2025! 🥳 Apply by 12/15 in either ECE or CS! Let's build and break chatbots together 🥳! khoury.northeastern.edu/apply/phd-appl…
New Anthropic research: Adding Error Bars to Evals. AI model evaluations don’t usually include statistics or uncertainty. We think they should. Read the blog post here: anthropic.com/research/stati…
lowkey love this blog post by @EvMill. It's a simple, well-written explainer that will cause you no harm to read, but everyone is up in the replies grandstanding about how obvious it is. If this is how you get people to improve their evals, so be it 😂
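The core idea, in a minimal sketch (the textbook treatment, not necessarily Anthropic's exact recipe, which as I recall also covers clustering and paired comparisons): treat each eval question as a draw and report accuracy with a standard error and confidence interval instead of a bare point estimate.

```python
# Hedged sketch of the basic idea: question-level outcomes give you a mean
# (the accuracy), a standard error for that mean, and a ~95% confidence
# interval. The 500 outcomes below are hypothetical.
import math

def accuracy_with_ci(correct_flags, z=1.96):
    n = len(correct_flags)
    acc = sum(correct_flags) / n
    se = math.sqrt(acc * (1 - acc) / n)     # standard error of a Bernoulli mean
    return acc, se, (acc - z * se, acc + z * se)

results = [1] * 380 + [0] * 120             # 1 = correct, 0 = incorrect
acc, se, (lo, hi) = accuracy_with_ci(results)
print(f"accuracy = {acc:.3f} +/- {se:.3f} (95% CI: {lo:.3f} to {hi:.3f})")
# Two models whose intervals overlap heavily may not be meaningfully different;
# for comparisons on the same question set, a paired test is tighter still.
```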
The Llama-3.1-Nemotron-70B-Instruct model aligned by our team is now live on the lmarena.ai leaderboard at overall rank 9. Everything used to create this model is public: code, data, and reward model. HF checkpoint: huggingface.co/nvidia/Llama-3…
#nlproc #LLM anyone aware of work on teaching LLM assistants to ask follow-up questions when prompts are unclear/incorrect?