Run-Ze Fan
@Vfrz525_
Incoming PhD student @UMassAmherst. Research Assistant @GAIR Lab @sjtu1896. NLP/LLMs. | Prev @ucas1978
🚨 New release: MegaScience
The largest & highest-quality post-training dataset for scientific reasoning is now open-sourced (1.25M QA pairs)!
📈 Trained models outperform official Instruct baselines
🔬 Covers 7+ disciplines with university-level textbook-grade QA
📄 Paper:…
When building MegaScience, we learned the hard way: 📈 Strong datasets need strong proxy models. Our data was too spicy 🌶️ for small models like Qwen2.5-1.5B & 3B—they just flopped. But once we tried Qwen3-14B and 30B… boom 💥, everything clicked. Kinda terrifying to think: if…
Thank you very much for sharing!!! Check out our resources:
Datasets & Models: huggingface.co/MegaScience
Code base: github.com/GAIR-NLP/MegaS…
Scientific Evaluation System: github.com/GAIR-NLP/lm-op…
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
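For anyone who wants to poke at the data, here is a minimal sketch of the workflow the proxy-model lesson above implies: load the QA pairs from the Hub and run a short SFT pass on proxies of different scales, then compare each against its official Instruct baseline. The repo id and field names are my assumptions (see the links above for the real ones), and trl's SFTTrainer stands in for the actual training recipe.

```python
# Minimal sketch, assuming a "MegaScience/MegaScience" repo id and
# question/answer field names (both are guesses, not confirmed).
from datasets import load_dataset
from trl import SFTTrainer, SFTConfig

data = load_dataset("MegaScience/MegaScience", split="train[:5000]")  # assumed repo id

def to_text(example):
    # Assumed field names, for illustration only.
    return {"text": f"Question: {example['question']}\nAnswer: {example['answer']}"}

data = data.map(to_text)

# Try proxies of different scales: the point from the thread above is that
# weak proxies (e.g., Qwen2.5-1.5B) may show no gain even on strong data.
for proxy in ("Qwen/Qwen2.5-1.5B", "Qwen/Qwen3-14B"):
    trainer = SFTTrainer(
        model=proxy,
        train_dataset=data,
        args=SFTConfig(
            output_dir=f"sft-{proxy.split('/')[-1]}",
            max_steps=200,
            dataset_text_field="text",
        ),
    )
    trainer.train()
    # Then evaluate each fine-tuned proxy on held-out science benchmarks and
    # compare against the corresponding official Instruct model.
```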
🥳Happy to share that our paper "Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception" has been accepted by #ACL2025! We explore leveraging LLMs' internal states to improve their knowledge boundary perception from efficiency and risk perspectives.
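As a toy illustration of the general idea (not the paper's specific method): read off a hidden state from the model and fit a lightweight probe that predicts whether the model actually knows the answer, i.e. a cheap knowledge-boundary signal. The model choice, example questions, and correctness labels below are placeholders.

```python
# Probe a model's internal state for "do I know this?" -- a generic sketch.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

name = "Qwen/Qwen2.5-1.5B"  # any causal LM works for the sketch
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, output_hidden_states=True)

def last_hidden(question: str) -> torch.Tensor:
    """Last-layer hidden state of the final prompt token."""
    inputs = tok(question, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[-1][0, -1]

# Questions plus 0/1 labels for whether the model answered correctly would
# normally be collected by grading its outputs; hard-coded here as placeholders.
questions = ["What is the capital of France?", "Who proved Fermat's Last Theorem?"]
model_was_correct = [1, 0]

X = torch.stack([last_hidden(q) for q in questions]).float().numpy()
probe = LogisticRegression(max_iter=1000).fit(X, model_was_correct)
# probe.predict_proba(new_hidden_states) then acts as a cheap knowledge-boundary signal.
```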
Good work. It seems that long-CoT pre-training has a significant impact on RL.
I believe we need a deeper understanding of the relationship between pre-training and RL scaling. How can we do pre-training better, so that language models are more amenable to smooth RL scaling? That is to say, pre-training for RL. If you are interested in it, welcome to…
🔥 Excited to share our work "Efficient Agent Training for Computer Use" Q: Do computer use agents need massive data or complex RL to excel? A: No, with just 312 high-quality trajectories, Qwen2.5-VL can outperform Claude 3.7, setting a new SOTA for Windows computer use. 1/6
Curious about the next paradigm shift in AI? Step into Generative AI Act II: Test Time Scaling Drives Cognition Engineering. We built this bilingual, research-meets-practical guide to cognition engineering. Big ideas, tutorials, and code included. arxiv: arxiv.org/abs/2504.13828
🔥 Happy to share our paper on test-time scaling (TTS)! 🚀 We take the position that generative AI has entered Act II, that is, cognition engineering driven by TTS. 🛠️ We provide many valuable resources to help the community use TTS to develop models' cognitive abilities.
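Since the guide is about test-time scaling, here is a minimal sketch of one of the simplest TTS recipes, self-consistency: sample several reasoning chains and majority-vote the final answer. The model name, prompt, and answer-extraction heuristic are my own placeholders, not taken from the paper.

```python
# Self-consistency: spend more compute at test time by sampling N chains
# of thought and voting over their final answers.
import re
from collections import Counter
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")

prompt = ("Q: A train travels 60 km in 1.5 hours. What is its average speed in km/h?\n"
          "A: Let's think step by step.")

samples = generator(
    prompt,
    do_sample=True,
    temperature=0.8,
    max_new_tokens=256,
    num_return_sequences=8,   # more samples -> more reliable vote
    return_full_text=False,
)

def extract_answer(text: str) -> str:
    # Naive extraction for the sketch: take the last number in the completion.
    nums = re.findall(r"-?\d+\.?\d*", text)
    return nums[-1] if nums else ""

votes = Counter(extract_answer(s["generated_text"]) for s in samples)
answer, _ = votes.most_common(1)[0]
print(answer)
```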
🥁🥁 Happy to share our latest efforts on math pre-training data, the MegaMath dataset! This is a 9-month project that started in the summer of 2024, and we finally deliver the largest math pre-training dataset to date, containing 💥370B💥 tokens of web, code, and synthetic data!
1/🚀 Excited to share our AI Realtor project! 🏡 We challenge AI to help sell real estate! This is an interdisciplinary agent work combining economics, personalization & LLM persuasion. More details are below! 👇
📢 Some Updates 📢 We are happy to further release DCLM-pro, a cleaner and larger ProX corpus (currently at around the 500B level) based on the great DCLM-baseline. In our preliminary training study, downstream performance seems to outperform the baseline by 1.7%. We think it…
🚀 Still relying on human-crafted rules to improve pretraining data? Time to try Programming Every Example (ProX)! Our latest efforts use LMs to refine data with unprecedented accuracy and bring up to 20x faster training in the general and math domains! 👇 Curious about the details?
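A hedged sketch of how I read the "programming every example" idea: a language model emits a tiny cleanup program per document, and a small executor applies it. The op names and the hard-coded program below are illustrative stand-ins, not ProX's actual operator set or model output.

```python
# Execute a per-document cleanup "program" instead of global heuristics.
from typing import Callable

def remove_lines(doc: str, line_idxs: list[int]) -> str:
    lines = doc.splitlines()
    return "\n".join(l for i, l in enumerate(lines) if i not in set(line_idxs))

def normalize_whitespace(doc: str) -> str:
    return "\n".join(" ".join(l.split()) for l in doc.splitlines())

def drop_doc(doc: str) -> str:
    return ""  # document judged not worth keeping at all

OPS: dict[str, Callable] = {
    "remove_lines": remove_lines,
    "normalize_whitespace": normalize_whitespace,
    "drop_doc": drop_doc,
}

def execute_program(doc: str, program: list[dict]) -> str:
    """Apply each op of the (LM-generated) program to the document."""
    for step in program:
        doc = OPS[step["op"]](doc, **step.get("args", {}))
    return doc

# In the real pipeline the program would come from a small LM; fixed here.
raw = "Subscribe to our newsletter!\nThe derivative of x^2   is 2x.\nClick here"
program = [{"op": "remove_lines", "args": {"line_idxs": [0, 2]}},
           {"op": "normalize_whitespace"}]
print(execute_program(raw, program))  # -> "The derivative of x^2 is 2x."
```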
#LIMO: The "Less is More" Law in LLM Reasoning
(1) 817 training examples with 57.1% AIME: We discovered the "Less is More" law in complex reasoning: on the American Invitational Mathematics Examination (AIME), LIMO's accuracy soared from 6.5% (compared to traditional methods like…
🤔 How many examples does an LLM need to learn competition-level math? Conventional wisdom: 100,000+ examples Our discovery: Just 817 carefully chosen ones 🤩 With pure SFT, LIMO achieves: 57.1% on AIME 94.8% on MATH LIMO: Less is More for Reasoning 📝 🔗 arxiv.org/pdf/2502.03387
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Shows that:
- RL generalizes in rule-based envs, esp. when trained with an outcome-based reward
- SFT tends to memorize the training data and struggles to generalize OOD
I've created slides for those curious about the recent rapid progress in linear attention: from linear attention to Lightning-Attention, Mamba2, DeltaNet, and TTT/Titans. Check it out here: sustcsonglin.github.io/assets/pdf/tal…
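For readers who want a concrete anchor before opening the slides: a minimal non-causal linear attention in PyTorch using the standard kernel feature-map trick (elu(x)+1). Lightning-Attention, Mamba2, DeltaNet, and TTT/Titans all build on variations of this O(N) formulation; this is a generic sketch, not code from the slides.

```python
# Linear attention: summarize keys/values once, so cost is linear in sequence length.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """q, k: (batch, seq, dim); v: (batch, seq, dim_v)."""
    phi_q = F.elu(q) + 1              # positive feature map
    phi_k = F.elu(k) + 1
    kv = torch.einsum("bnd,bne->bde", phi_k, v)   # (batch, dim, dim_v), linear in seq
    z = phi_k.sum(dim=1)                          # (batch, dim) normalizer
    out = torch.einsum("bnd,bde->bne", phi_q, kv)
    denom = torch.einsum("bnd,bd->bn", phi_q, z).unsqueeze(-1) + eps
    return out / denom

q = torch.randn(2, 128, 64)
k = torch.randn(2, 128, 64)
v = torch.randn(2, 128, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([2, 128, 64])
```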
🤔Dreaming of AI agents that can handle complex work? Frustrated by the endless hunt for agent training data? 🚀Introducing PC Agent & PC Tracker, our human cognition transfer framework enabling AI to perform complex computer tasks: 📹 PC Tracker: the first lightweight…
Check out our scalable PC Agent data annotation framework (combining unsupervised and supervised data) and the training recipes we've developed.
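As a rough illustration (not PC Tracker's actual implementation) of what lightweight interaction logging looks like: capture a screenshot plus the click position for every mouse click, so (state, action) pairs can later be annotated into training trajectories. Uses pynput and mss; the paths and JSON schema are my own choices.

```python
# Log (screenshot, click) pairs as a raw interaction trace.
import json
import time
from pathlib import Path

import mss               # cross-platform screen capture
from pynput import mouse # global mouse hook

out_dir = Path("trace")
out_dir.mkdir(exist_ok=True)
events = []

def on_click(x, y, button, pressed):
    if not pressed:
        return
    ts = time.time()
    shot_path = out_dir / f"{ts:.3f}.png"
    with mss.mss() as sct:
        sct.shot(output=str(shot_path))  # save a full-screen screenshot
    events.append({"t": ts, "x": x, "y": y,
                   "button": str(button), "screenshot": shot_path.name})
    (out_dir / "events.json").write_text(json.dumps(events, indent=2))

with mouse.Listener(on_click=on_click) as listener:
    listener.join()  # Ctrl+C to stop; trace/events.json holds the raw trace
```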
🤔 Struggling to train capable AI agents due to lack of quality data? 🚀 Meet PC Tracker & PC Agent - our groundbreaking system that learns from real human computer operation process to handle complex digital work! Watch how PC Agent automatically creates slides about Attention…
[Long Tweet Ahead] Faculty Interview Tips & Common Questions:
🧘♀️ 0. Firstly, do not be nervous
- Almost everything can be prepared in advance :)
- Be grateful for everyone's time.
- Think of it as an opportunity to share your research with others -- exciting, right?
- Technical…
Sappy thanksgiving post: I believe humans are simply the product of their environment, and so I am thankful for the combined influence of the people I’ve worked with making me the researcher I am. A short history: Ryan Cotterell was the first really strong language AI researcher…
Prediction: within the next year there will be a pretty sharp transition of focus in AI from general user adoption to the ability to accelerate science and engineering. For the past two years it has been about user base and general adoption across the public. This is very…
Interesting work!
In Miami for #EMNLP2024! Come check out our findings poster, Weak-to-Strong Reasoning, on Wednesday at 10:30am. Super excited for my first in-person conference. Looking forward to connecting and chatting about reasoning, hallucination, self-correction, and all things LLMs! 🌴🌴