Zhiding Yu
@ZhidingYu
Working to make machines understand the world like human beings. Words are my own.
Thank you AK! Excited to introduce Eagle 2.5, NVIDIA’s latest vision-language model that brings strong long-context capabilities across both image and video understanding — all with just 8B parameters. Most existing VLMs struggle with high-res inputs and long video contexts.…
Nvidia presents Eagle 2.5! - A family of frontier VLMs for long-context multimodal learning - Eagle 2.5-8B matches the results of GPT-4o and Qwen2.5-VL-72B on long-video understanding
New tech report out! 🚀 Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training An expanded version of our ProRL paper — now with more training insights and experimental details. Read it here 👉 arxiv.org/abs/2507.12507
Does RL truly expand a model’s reasoning🧠capabilities? Contrary to recent claims, the answer is yes—if you push RL training long enough! Introducing ProRL 😎, a novel training recipe that scales RL to >2k steps, empowering the world’s leading 1.5B reasoning model💥and offering…
🤖 How can we teach embodied agents to think before they act? 🚀 Introducing ThinkAct — a hierarchical Reasoning VLA framework with an MLLM for complex, slow reasoning and an action expert for fast, grounded execution. Slow think, fast act. 🧠⚡🤲
And today we have just open-sourced the Eagle 2.5 model huggingface.co/nvidia/Eagle2.… You are welcome to download it and give it a try! We will also open-source the fine-tuning code for Eagle 2/2.5 soon at github.com/NVlabs/Eagle. Stay tuned.
I did not notice this until just now. Thank you @andimarafioti for the recommendation! Very glad that even though Eagle 2 is not our latest work, people still find it very useful.
The Eagle 2 paper from Nvidia is such a goldmine.
Come to the T4V Workshop this Thursday (June 12th) and check out the latest developments in Transformers!
@CVPR is around the corner!! Join us at the Workshop on T4V at #CVPR2025 with a great speaker lineup (@MikeShou1, @jw2yang4ai, @WenhuChen, @roeiherzig, Yuheng Li, Kristen Grauman) covering diverse topics! Website: sites.google.com/view/t4v-cvpr2… #CVPR #Transformer #Vision #T4V2025 #T4V
Document and Enterprise Intelligence is arguably one of the most important applications of VLMs and cloud services. NVIDIA VLM technologies help to build commercial grade models excelling in this area. The Eagle VLM Team, together with other colleagues at NVIDIA, are proud to be…
🥇Our NVIDIA Llama Nemotron Nano VL model is #1 on the OCRBench V2 leaderboard. Designed for advanced intelligent document processing and understanding, this model extracts diverse info from complex documents with precision, all on a single GPU. 📗 Get the technical details…
Cool paper from @nvidia Prior methods for training LLMs for tool use rely on imitation or distilled reasoning, limiting generalization. Nemotron-Research-Tool-N1 uses rule-based reinforcement learning. It trains models with binary rewards evaluating only tool call structure…
Check out this super cool work by our intern @ShaokunZhang1 - RL + tool use is the future of LLM agents! Before joining NVIDIA, Shaokun was a contributor to the famous multi-agent workflow framework #AutoGen. Now the age of agent learning is coming, beyond workflow control!
Tool-using LLMs can learn to reason—without reasoning traces. 🔥 We present Nemotron-Research-Tool-N1, a family of tool-using reasoning LLMs trained entirely via rule-based reinforcement learning—no reasoning supervision, no distillation. 📄 Paper: arxiv.org/pdf/2505.00024 💻…
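The core idea above - binary rewards that evaluate only the structure of the tool call, with no reasoning supervision - can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation; the `<tool_call>` tag format and the exact-match criterion are assumptions for the sake of the example.

```python
import json
import re

def binary_tool_reward(model_output: str, gold_call: dict) -> float:
    """Return 1.0 if the generated tool call matches the gold call, else 0.0.

    The reward inspects only the tool call itself (name + arguments) and
    ignores any free-form reasoning text around it - no reasoning traces
    are ever scored or supervised.
    """
    # Extract the tool call from a <tool_call>...</tool_call> span
    # (this tag format is a hypothetical choice for illustration).
    match = re.search(r"<tool_call>(.*?)</tool_call>", model_output, re.DOTALL)
    if not match:
        return 0.0  # malformed output: no parsable tool call
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return 0.0  # reward is binary: unparsable JSON earns nothing
    # Exact match on tool name and arguments; no partial credit.
    if call.get("name") == gold_call["name"] and \
       call.get("arguments") == gold_call["arguments"]:
        return 1.0
    return 0.0
```

Because the signal is purely rule-based, the model is free to produce any intermediate reasoning it likes before the call; only a well-formed, correct tool invocation is rewarded.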
The 4th Workshop on Transformers for Vision (T4V) at CVPR 2025 is soliciting self-nominations for reviewers. If you're interested, please fill out this form: forms.gle/cJKkywCyFAboct… More information can be found on our website: sites.google.com/view/t4v-cvpr2…
"Once the war is over, what would they profit from?"
The Warlords (《投名状》) is an absolute masterpiece. Masterpieces all share one flaw: the box office is lukewarm on release, because the film is too complex and many people don't get it, but it keeps gaining popularity as time settles. Those three old men are far scarier than any photographs or portraits left over from the late Qing. The artistic level of this film is sky-high; three or four stories high, in fact.
Congrats @angli_ai and team!
The Simular team is proud to share: 🎉 𝗔𝗴𝗲𝗻𝘁 𝗦 has won the 𝗕𝗲𝘀𝘁 𝗣𝗮𝗽𝗲𝗿 𝗔𝘄𝗮𝗿𝗱 at the Agentic AI for Science Workshop at #ICLR2025 @iclr_conf! 🎉 It's the first open-source computer-use agent, and the first to surpass 20% on OSWorld at the time of its…