Junlong Li
@lockonlvange
Incoming PhD @hkust | CS MS/BS student @sjtu1896 | Interning @deepseek_ai | Ex Intern MSRA @MSFTResearch
Reasoning models like O1/R1 are powerful but ... waste sooooooo many tokens overthinking even simple questions like "1+1"! If you are also troubled by this, don't miss LASER! It makes answers much shorter and even more accurate, beating various baselines.
“What is the answer to 1 + 1?” Large Reasoning Models (LRMs) may generate 1500+ tokens just to answer this trivial question. Too much thinking 🤯 Can LRMs be both Faster AND Stronger? Yes. Introducing LASER💥: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping…
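For intuition, here is a minimal Python sketch of length-based reward shaping in general. This is not LASER's actual formulation; the function name, length budget, and bonus shape are all illustrative assumptions.

```python
# Minimal sketch of length-based reward shaping for RL fine-tuning.
# NOT LASER's exact reward; thresholds and names are illustrative only.

def shaped_reward(is_correct: bool, num_tokens: int, target_len: int = 1024) -> float:
    """Reward correctness, plus a bonus for staying under a length budget.

    - Wrong answers get 0 regardless of length, so the model is not pushed
      to truncate reasoning it still needs.
    - Correct answers get 1.0, plus a bonus that shrinks as the response
      approaches (or exceeds) the target length.
    """
    if not is_correct:
        return 0.0
    # Linear length bonus in [0, 1]: full bonus for very short correct answers,
    # no bonus once the response exceeds the target length.
    length_bonus = max(0.0, 1.0 - num_tokens / target_len)
    return 1.0 + length_bonus


# A correct 120-token answer now outranks a correct 1500-token answer.
print(shaped_reward(True, 120))    # ~1.88
print(shaped_reward(True, 1500))   # 1.0
print(shaped_reward(False, 120))   # 0.0
```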
Excited to share DreamOn—our latest work teaching diffusion LMs to dynamically expand and contract beyond fixed-size canvases!
We present DreamOn: a simple yet effective method for variable-length generation in diffusion language models. Our approach boosts code infilling performance significantly and even catches up with oracle results.
🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
👇 This nice guy ❤️ will help us present CodeI/O (arxiv.org/abs/2502.07316) at Oral Session 6A, Applications in Agents and Coding, Thu 17 Jul, 4:00–4:15 p.m. PDT. Stop by if you're there and interested.
Attending #ICML2025 🇨🇦 this week! Will be presenting Aguvis (arxiv.org/abs/2412.04454) on July 15 at 11am, and joining Computer Use Agent Workshop @workshopcua on July 19. If you’re into digital agent research, especially around computer/browser use, let’s grab a coffee!
Excited to share our new survey on the reasoning paradigm shift from "Think with Text" to "Think with Image"! 🧠🖼️ Our work offers a roadmap for more powerful & aligned AI. 🚀 📜 Paper: arxiv.org/pdf/2506.23918 ⭐ GitHub (400+🌟): github.com/zhaochen0110/A…
Glad to see @ChengZhoujun demystifying the data-mixture recipe across a broad set of domains in RLVR. Great work, congrats!
🤯 What we know about RL for reasoning might not hold outside math and code? We revisit established findings on RL for LLM reasoning across six domains (Math, Code, Science, Logic, Simulation, Tabular) and find that previous conclusions drawn on math and code are surprisingly…
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective "We introduce GURU, a curated RL reasoning corpus of 92K verifiable examples spanning six reasoning domains—Math, Code, Science, Logic, Simulation, and Tabular—each built through domain-specific…
🧵Interesting paper—great to see the emphasis on large token counts, which is always appreciated. 😅But some of the results are... puzzling. For example, Table 3 essentially suggests that MegaMath is a non-math corpus. This is weird, especially given the care we've taken during…
[1/5] 🚀 Meet Essential-Web v1.0, a 24-trillion-token pre-training dataset with rich metadata built to effortlessly curate high-performing datasets across domains and use cases!
We studied both rule-based and model-based verifiers and found that each has unique limitations. Rule-based verifiers are often unreliable, even in math, and are unavailable in many domains. Model-based verifiers can be easily hacked. In our paper, we construct simple…
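To illustrate the rule-based failure mode, here is a toy exact-match verifier (not any of the verifiers studied in the paper); it already shows how a correct answer written in a different form gets rejected and earns no reward.

```python
# Toy illustration of why naive rule-based verification is brittle.
# Purely hypothetical; not the verifiers evaluated in the paper.

def exact_match_verifier(prediction: str, reference: str) -> bool:
    """Accept only if the predicted answer string matches the reference exactly."""
    return prediction.strip() == reference.strip()

# The same correct value, written differently, is marked wrong,
# so the RL policy gets no reward for a perfectly valid answer.
print(exact_match_verifier("1/2", "0.5"))    # False (should be True)
print(exact_match_verifier("x = 3", "3"))    # False (should be True)
print(exact_match_verifier("0.5", "0.5"))    # True
```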
🔍 Are Verifiers Trustworthy in RLVR? Our paper, Pitfalls of Rule- and Model-based Verifiers, exposes the critical flaws in reinforcement learning verification for mathematical reasoning. 🔑 Key findings: 1️⃣ Rule-based verifiers miss correct answers, especially when presented in…
Lack of RL Logical Reasoning data? Excited to share SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond 🚀 Building strong logical reasoning through RLVR 📄Paper: huggingface.co/papers/2505.19… 💻 Code: github.com/MiniMax-AI/Syn… (1/n)
I really like this paper. I'd like to echo the point that RL-related conclusions should be drawn cautiously when they are based only on Qwen models and only on math tasks. Our SimpleRL-Zoo paper is one of the few that actually conducts RLVR across diverse models: arxiv.org/abs/2503.18892
🤯 We cracked RLVR with... Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by: - Random rewards: +21% - Incorrect rewards: +25% - (FYI) Ground-truth rewards: +28.8% How could this even work⁉️ Here's why: 🧵 Blogpost: tinyurl.com/spurious-rewar…
Exciting news! Enabling AI to truly "think" with images 🖼️🧠 is a huge challenge. We introduce OPENTHINKIMG, the first open-source E2E framework, powered by our novel V-TOOLRL! Paper: arxiv.org/pdf/2505.08617 #AI #LVLM #RL #OpenSource
We've released the code for LegoGPT. This autoregressive model generates physically stable and buildable designs from text prompts, by integrating physics laws and assembly constraints into LLM training and inference. This work is led by PhD students @AvaLovelace0, @kangle_deng,…
Are attention heads the right units to mechanistically understand Transformers' attention behavior? Probably not, due to attention superposition! We extracted interpretable attention units in LMs and found finer-grained versions of many known and novel attention behaviors. 🧵1/N
Excited to share our work studying CLIP and LLaVA on chart understanding tasks 🔍 Since the CLIP vision encoder is LLaVA's source of visual information, have you ever wondered how CLIP's capabilities affect LLaVA? 📄Paper: arxiv.org/abs/2503.18435 📷Code: github.com/hkust-nlp/Visi… (1/5)
🥁🥁 Happy to share our latest efforts on math pre-training data, the MegaMath dataset! This is a 9-month project starting from 2024’s summer, and we finally deliver: the largest math pre-training data to date containing 💥370B 💥tokens of web, code, and synthetic data!
Two months ago, we open-sourced the first R1-like zero RL training project on math with the Qwen2.5-math model. Since then, many great works performed successful zero RL training, mostly based on Qwen2.5 models. 🚀Now, we introduce SimpleRL-Zoo, a deep investigation of zero RL…
🚀 Excited to introduce Predictive Data Selection (PreSelect): The Data That Predicts Is the Data That Teaches 🚀 We find that data on which model losses are predictive of downstream abilities also contributes effectively to learning. Then we further propose predictive data…
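A rough sketch of the "predictive data" idea as stated here: correlate a document's loss across several reference models with those models' downstream scores, and keep documents where lower loss tracks stronger ability. The function name and numbers below are illustrative assumptions, not PreSelect's actual scoring pipeline.

```python
# Rough sketch of scoring a document by how well per-model losses on it
# track the models' downstream ability. Illustrative only; the actual
# PreSelect procedure is described in the paper.
import numpy as np

def predictive_strength(doc_losses: np.ndarray, downstream_scores: np.ndarray) -> float:
    """Negated correlation between a document's loss under several reference
    models and those models' downstream benchmark scores.

    A strongly negative loss-vs-ability correlation (lower loss on this doc
    <-> stronger model) means the document is "predictive" and, per the
    hypothesis, more useful to train on.
    """
    corr = np.corrcoef(doc_losses, downstream_scores)[0, 1]
    return -corr  # higher = more predictive of downstream ability

# Example: 4 reference models, their losses on one document, and their scores.
losses = np.array([2.9, 2.4, 2.0, 1.6])        # weaker -> stronger models
scores = np.array([0.31, 0.42, 0.55, 0.63])    # downstream ability
print(predictive_strength(losses, scores))     # close to 1.0 -> keep this doc
```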
🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview Optimized throughput and latency via: 🔧 Cross-node EP-powered batch scaling 🔄 Computation-communication overlap ⚖️ Load balancing Statistics of DeepSeek's Online Service: ⚡ 73.7k/14.8k…