verl project
@verl_project
Open RL library for LLMs. https://github.com/volcengine/verl Join us on http://verl-project.slack.com
Remember NoisyStudent topping ImageNet back in 2019🏆? Was it the last dance of noisy training? 🍻 Meet NoisyRollout, our new noisy-training effort toward building stronger o1-like visual reasoners. ✨ With only 2.1k training examples and zero additional training cost, it hits…
Nice new results from @GoogleAI researchers on improving the state-of-the-art on ImageNet! "We...train a...model on...ImageNet...& use it as a teacher to generate pseudo labels on 300M unlabeled images. We then train a larger...model on the...labeled & pseudo labeled images."
“What is the answer to 1 + 1?” Large Reasoning Models (LRMs) may generate 1500+ tokens just to answer this trivial question. Too much thinking 🤯 Can LRMs be both Faster AND Stronger? Yes. Introducing LASER💥: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping…
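The tweet is truncated, but the core idea of adaptive length-based reward shaping can be sketched: correct answers keep full reward while short, and correct-but-long ones are discounted, so the policy learns to stop overthinking. A minimal sketch, assuming a simple linear decay beyond a fixed token budget (the actual LASER formulation is in the paper; `budget` and the decay rule here are illustrative):

```python
# Illustrative length-based reward shaping (not the exact LASER rule).
# Assumption: a correct answer keeps full reward within a token budget,
# then decays linearly, discouraging needlessly long reasoning.

def shaped_reward(is_correct: bool, num_tokens: int, budget: int = 1024) -> float:
    if not is_correct:
        return 0.0          # no credit for wrong answers, whatever the length
    if num_tokens <= budget:
        return 1.0          # full reward inside the budget
    overshoot = num_tokens - budget
    return max(0.0, 1.0 - overshoot / budget)  # decay beyond the budget

# A 1500-token answer to "1 + 1" earns ~0.54; a 20-token one earns 1.0.
print(shaped_reward(True, 1500), shaped_reward(True, 20))
```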
1/ Long chain-of-thought (CoT) reasoning boosts LLM performance, but it comes with computational overhead. Check out our new paper, ThinkPrune, where we explore a simple question: to what extent can we cut the reasoning length while keeping the quality? We show that by simply adding a hard…
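One natural way to impose a hard limit during RL, sketched below: zero out the reward of any rollout that exceeds the current length cap, then tighten the cap over successive rounds. This is an assumption-laden illustration, not necessarily ThinkPrune's exact recipe:

```python
# Sketch: hard length clipping in the RL reward, tightened per round.
# Assumptions (not from the tweet): over-cap rollouts get zero reward,
# and the cap shrinks on a fixed schedule across training rounds.

def clipped_reward(base_reward: float, num_tokens: int, cap: int) -> float:
    return base_reward if num_tokens <= cap else 0.0

caps = [4096, 3072, 2048]  # illustrative pruning schedule
for round_idx, cap in enumerate(caps):
    # each round would run PPO/GRPO using clipped_reward(..., cap=cap)
    print(f"round {round_idx}: reward zeroed beyond {cap} tokens")
```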
🚨 4B open-recipe model beats Claude-4-Opus 🔓 100% open data, recipe, model weights, and code. Introducing Polaris✨, a post-training recipe for scaling RL on advanced reasoning models. 🥳 Check out how we boost open-recipe reasoning models to incredible performance levels…
MemAgent: MemAgent-14B is trained on 32K-length documents with an 8K context window, yet achieves >76% accuracy even at 3.5M tokens! That consistency is crazy! Here are my notes:
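The trick that lets an 8K window cover 3.5M tokens: the agent reads the document in chunks and carries a fixed-size memory between them, so each forward pass only ever sees one chunk plus the current memory. A minimal sketch of that loop (the chunk size, prompts, and `llm` callable are illustrative stand-ins, not MemAgent's actual interface):

```python
# Sketch: chunked reading with a fixed-budget memory carried across chunks.
# Assumption: `llm(prompt)` is any text-completion function; prompts and the
# 4K-token chunk size are illustrative.

def answer_long_document(llm, doc_tokens: list[str], question: str,
                         chunk_size: int = 4096) -> str:
    memory = ""  # fixed-size scratchpad, overwritten as reading proceeds
    for start in range(0, len(doc_tokens), chunk_size):
        chunk = " ".join(doc_tokens[start:start + chunk_size])
        # each call fits the context window: one chunk + current memory
        memory = llm(
            f"Question: {question}\nMemory: {memory}\n"
            f"New text: {chunk}\nRewrite the memory, keeping what matters."
        )
    return llm(f"Question: {question}\nMemory: {memory}\nAnswer:")
```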
The 1st verl meetup will be held at ICML Vancouver on July 16th! Please join us if you will be there! lu.ma/0ek2nyao (onsite only) Featuring speakers from verl & SGLang dev team, plus @BeidiChen from @InfiniAILab and @jxwuyi from Ant RL Lab #verl #ICML #Vancouver
If you're in Singapore on 7/11, do not miss this meetup! Talks from the verl community: - LLMs to optimize code performance on real-world repos & verl project updates @sivil_taram - Long-horizon LLM agent training with verl-agent @langfengq Link: lu.ma/e498qhsi
Open-source "verl-agent" codebase is evolving fast⚡ A scalable, multi-turn reinforcement learning framework for training LLM/VLM-based agents — now with rich features! (see summary in image below🔽) 🚀 Try it out and train your own LLM agents 📎 GitHub: github.com/langfengQ/verl…
🚀 Excited to introduce our latest work GRESO: a method that identifies and skips millions of uninformative prompts before rollout, achieving up to a 2.0x wall-clock speedup in training. More rollouts lead to better model performance, but they’re also a major bottleneck in…
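The intuition behind skipping prompts before rollout: in group-based methods like GRPO, a prompt whose sampled responses come out all-correct or all-wrong has zero advantage and contributes no gradient, so its rollouts are wasted compute. A sketch of pre-rollout filtering under that assumption (GRESO's actual criterion is based on training dynamics; the `history` heuristic here is an illustrative stand-in):

```python
# Sketch: drop prompts unlikely to yield a learning signal before rollout.
# In GRPO, an all-correct or all-wrong group has zero advantage. Assumption:
# we predict that from each prompt's recent accuracy history.

def is_informative(history: list[float], eps: float = 0.05) -> bool:
    """history: recent per-rollout accuracies (0.0..1.0) for this prompt."""
    if not history:
        return True  # never sampled; must roll out to find out
    mean_acc = sum(history) / len(history)
    # near-0 or near-1 accuracy implies near-zero advantage
    return eps < mean_acc < 1.0 - eps

prompts = {"p1": [1.0, 1.0, 1.0], "p2": [0.25, 0.5], "p3": []}
print([p for p, h in prompts.items() if is_informative(h)])  # ['p2', 'p3']
```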
🚀 Thrilled to unveil ReVisual-R1! Our 7B open-source MLLM achieves long, accurate & thoughtful reasoning! 🔥 SOTA on 9 key benchmarks! Including AIME24 (53.3) & MathVision (48.8). Overall +16.8% avg! 📈 📄 Paper: arxiv.org/pdf/2506.04207 💻 Code: github.com/CSfufu/Revisua…
💥Async RL rollouts that are 75% faster than other async implementations (see the sketch below):
- removing all synchronous parts of the rollout
- a single step in a multi-turn rollout is independent of, and async with respect to, all other completions
- completions can finish independently of other completions
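A toy picture of what fully asynchronous rollouts look like: every completion is its own task, the turns of one rollout await only their own history, and results are consumed as each finishes rather than at a batch barrier. `generate` below is a placeholder for a real async inference call, not the actual implementation:

```python
# Sketch: fully async multi-turn rollouts; no batch-level synchronization.
import asyncio
import random

async def generate(prompt: str) -> str:
    await asyncio.sleep(random.uniform(0.1, 1.0))  # simulated server latency
    return prompt + " <turn>"

async def rollout(prompt: str, num_turns: int = 3) -> str:
    text = prompt
    for _ in range(num_turns):       # turns of ONE rollout run in order,
        text = await generate(text)  # independent of every other rollout
    return text

async def main():
    tasks = [asyncio.create_task(rollout(f"prompt-{i}")) for i in range(8)]
    for done in asyncio.as_completed(tasks):  # consume results as they land
        print(await done)

asyncio.run(main())
```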
DeepSeek 671b and Qwen3 236b support with the Megatron backend is now available as a preview in verl v0.4.0 🔥🔥🔥 We will continue optimizing MoE model performance down the road. DeepSeek 671b: verl.readthedocs.io/en/latest/perf… verl v0.4: github.com/volcengine/ver…

Distributed training on GPU clusters shouldn't be complex. Check out the latest blog on orchestrating reasoning agent training with RAGEN and @verl_project on our 1-Click Clusters, powered by @dstackai 🔗 lambda.ai/blog/agent-tra…
Introducing VerlTool - a unified and easy-to-extend tool agent training framework based on verl. Recently, there's been a growing trend toward training tool agents with reinforcement learning algorithms like GRPO and PPO. Representative works include SearchR1, ToRL, ReTool, and…
Multi-GPU LoRA RL is now available in verl! It enables 70B+ model RL with 8 GPUs in bf16. Getting started: verl.readthedocs.io/en/latest/adva… Credit to: Simon Huang, @vermouth1992 @stephenx_ @jiayi_pirate @LongTonyLian skepsun, Weitao Feng, Alexey Malakhov, and many in the community
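For a taste of what enabling it looks like, LoRA in verl is driven by model-config options such as a LoRA rank and alpha. The dict below is only a sketch of plausible Hydra-style override keys; the exact, current option names are in the linked docs:

```python
# Illustrative overrides for LoRA RL in verl (names/values are assumptions;
# verify against verl.readthedocs.io before use).
lora_overrides = {
    "actor_rollout_ref.model.lora_rank": 32,          # >0 enables LoRA
    "actor_rollout_ref.model.lora_alpha": 32,
    "actor_rollout_ref.model.target_modules": "all-linear",
}
```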
Guess it's the first open-source multi-turn e2e RL for GUI Agents from academia, and it's based on UI-TARS-1.5-7B. If you want to study multimodal Agent RL, it's a good starting point~ arxiv.org/abs/2505.16282
1/N Introducing SkyRL-SQL, a simple, data-efficient RL pipeline for Text-to-SQL that trains LLMs to interactively probe, refine, and verify SQL queries with a real database. 🚀 Early Result: trained on just ~600 samples, SkyRL-SQL-7B outperforms GPT-4o, o4-mini, and SFT model…
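What makes the pipeline interactive: instead of emitting one SQL query blind, the policy can execute intermediate queries against a live database and refine based on observed rows or errors. A minimal sketch of such a multi-turn loop (the `llm` callable, prompts, and turn budget are illustrative assumptions, not SkyRL-SQL's actual interface):

```python
# Sketch: multi-turn Text-to-SQL where the model probes a real database.
import sqlite3

def interactive_sql(llm, db: sqlite3.Connection, question: str,
                    max_turns: int = 4) -> str:
    transcript = f"Question: {question}"
    sql = ""
    for _ in range(max_turns):
        sql = llm(transcript)  # propose or refine a query from the transcript
        try:
            rows = db.execute(sql).fetchall()
            feedback = f"rows: {rows[:5]}"   # observe real results
        except sqlite3.Error as exc:
            feedback = f"error: {exc}"       # or recover from a real error
        transcript += f"\nSQL: {sql}\nResult: {feedback}"
    return sql  # the final query is what the RL reward would score
```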