Zhuolin Yang

@lucas110550

Research Scientist @NVIDIA, Ph.D @UofIllinois. Words are my own.

Santa Clara

Joined February 2016

30 Following

19 Followers

Pinned

Zhuolin Yang@lucas110550 · Jun 17

With a stronger SFT backbone, AceReason-Nemotron-1.1-7B significantly outperforms its predecessor and sets record-high performance among Qwen2.5-7B-based reasoning models.
📄Report: arxiv.org/pdf/2506.13284
🤗Model: huggingface.co/nvidia/AceReas…
📚SFT Data: huggingface.co/datasets/nvidi…

Wei Ping@_weiping · Jun 17

Introducing AceReason-Nemotron 1.1.
Our previous release, AceReason-Nemotron-1.0, introduced a stage-wise RL recipe that was applied sequentially to math-only and code-only prompts, demonstrating both high efficiency and strong effectiveness. Here, we systematically investigate…

Zhuolin Yang@lucas110550 · Jun 17

Our released evaluation toolkit can reproduce the reported numbers for our AceReason-Nemotron models (see below):
AceReason-Nemotron-1.0-7B:
LiveCodeBench (Avg@8):
* [05/23-05/24]: 72.0; [06/24-01/25]: 54.2
* release set v5: 51.2; release set v6: 44.4
AIME (Avg@64):
* AIME'24: 68.6; AIME'25:…

Yang Chen@ychenNLP · Jun 17

The first thing we did was make sure the eval setup is correct! We spent a lot of time making sure our eval can
- accurately reproduce the DeepSeek-R1 numbers on AIME, LiveCodeBench
- it's IMPOSSIBLE to track the RL progress without a good eval setup (e.g., we see AIME up…
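For readers unfamiliar with the Avg@8 and Avg@64 notation in the numbers above, here is a minimal sketch of how such a score is commonly computed: sample k generations per problem, take the fraction that are correct, and average that fraction over all problems. This is an assumption about the usual meaning of Avg@k, not the released toolkit's code; the function name `avg_at_k` and the input layout are hypothetical.

```python
from statistics import mean


def avg_at_k(results, k):
    """Return Avg@k as a percentage.

    `results` is a list with one entry per problem; each entry is a list
    of k booleans saying whether each sampled generation was correct.
    (Hypothetical layout, chosen only for illustration.)
    """
    per_problem = []
    for samples in results:
        if len(samples) != k:
            raise ValueError(f"expected {k} samples, got {len(samples)}")
        # Fraction of the k samples that solved this problem.
        per_problem.append(sum(samples) / k)
    # Average the per-problem accuracy, then scale to a percentage.
    return 100.0 * mean(per_problem)


# Two problems, two samples each: accuracies 0.5 and 1.0.
score = avg_at_k([[True, False], [True, True]], k=2)
print(score)
```

Averaging over many samples per problem reduces the run-to-run variance that small benchmarks such as AIME would otherwise show with a single sample.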


Zhuolin Yang Retweeted

Yang Chen@ychenNLP · Jun 17

📢We conducted a systematic study to demystify the synergy between SFT and RL for reasoning models. The result? We trained a 7B model, AceReason-Nemotron-1.1, that improves significantly on version 1.0 on math and coding benchmarks.
✅AIME2025 (math): 53.6% -> 64.8%
✅LiveCodeBench…
