Fei Fang
@fangf07
Associate Professor, Carnegie Mellon University
✨ Did you know that NOT using all generated rollouts in GRPO can boost your reasoning LLM? Meet PODS! We down-sample rollouts and train on just a fraction, delivering notable gains over vanilla GRPO. (1/7)
We release fully async RL system AReaL-boba² for LLM & SOTA code RL w. Qwen3-14B! @Alibaba_Qwen #opensource 🚀system&algorithm co-design → 2.77x faster ✅ 69.1 on LiveCodeBench 🔥 multi-turn RL ready 🔗 Project: github.com/inclusionAI/AR… 📄 Paper: arxiv.org/pdf/2505.24298 1/3👇
I’m so proud of you!
Another life update!! 🎉 I’m joining @JHUCompSci as an Assistant Professor starting Fall 2026! Apply to work with me on reinforcement learning, foundation models, & human-centered AI. Let’s build better AI agents 🤖🙆‍♀️🦀 Before that, I’ll join @NYU_Courant as an Assistant…
Excited to be at #aamas2025!
- My keynote talk at C-MAS workshop today: 2-2:45pm, Maquette A
- Will attend panel at ALA workshop today: 4:30-5:30pm, Salon 2
- Siyu Liu (PhD advised by @___tiffanyb___ ) will present our joint paper on Friday 10:45am, Salon 3
Congratulations to @steph_milani for successfully defending her PhD thesis on “Human-Centered Reinforcement Learning”. Special thanks to her committee members Geoff Gordon @hongshenus @katjahofmann @OriolVinyalsML

Congratulations to @du_yinuo (co-advised by me and @CotyGonzalez12) for successfully defending her PhD thesis on “Human and AI Decision Making in Cyber Security: A Multiagent Modeling Perspective”. Thanks to committee members Prashanth Rajivan @___tiffanyb___ Christian Lebiere

🎉 Milestone Release! AReaL-boba, our latest #RL system! github.com/inclusionAI/AR… #AI • data/code/model ALL🔥 #OPENSOURCE • Full #SGLang & 1.5x faster on 7B RL • SOTA 7B math reasoning: 61.9 AIME24 & 48.3 AIME25 • 200-sample 32B tuning match QwQ on AIME24 @Alibaba_Qwen 1/3 👇
Congratulations! I’m so proud of you, Lily!
Congratulations to my former PhD student Lily Xu @lilyxu0 for her PhD thesis "High-stakes decisions from low-quality data: AI decision-making for planetary health" being selected as runner-up for the doctoral dissertation award (VLDDA) at the AAMAS conference 2025 @AAMASconf !
Join us between 9am-12pm on Dec 15th at #NeurIPS2024 (rooms 215-216) as we announce the winners of the Concordia Contest and hear talks from Sasha Vezhnevets, @lrhammond, and @ChandlerDSmith, with discussion from @fangf07, @jzl86, @Diyi_Yang, and José Hernández-Orallo.
Stephanie’s oral presentation today at GenAI4Health workshop! 12/14 | East Meeting Room 16 🕙 10:40am - Oral Presentation 🕐 1:00pm - Poster Session 🕓 4:20pm - Poster Session Paper: arxiv.org/abs/2405.19660 Workshop: genai4health.github.io
🇨🇦 Hi! I’m attending my last @NeurIPSConf as a PhD student, presenting Patient-Ψ at a few workshops. I'm on the job market, looking for TT faculty roles & post-docs. DM if you'd like to chat (or invite me to a party 🥳)!
Another exciting work to be presented at this afternoon’s poster session!
Excited to share our poster on Multi-Agent Imitation Learning! 🚀 We explore what the right objective should be when agents can behave strategically—and surprisingly, even exact BC fails to provide any guarantees 🕙 Dec 13th 4:30–7:30 PM 📍 West Ballroom #6101 #NeurIPS2024
Welcome to our poster this afternoon!
We’ll be presenting our work on Global Rewards in Restless Bandits today at #NeurIPS2024 in the West Ballroom #5500 at 4:30 PM!
Do reviewers appropriately update their scores based on rebuttals? In this PLOS ONE paper with @theryanliu, Steven Jecmen, @fangf07, and Nihar Shah, we present a randomized controlled trial that suggests that, at least under certain conditions, they do. journals.plos.org/plosone/articl…
We are honored that our paper assignment algorithm is being used for NeurIPS 2024! Also glad to contribute to the analysis of the deployed assignment :) @YixuanEvenXu
Announcing the Results of the NeurIPS 2024 Experiment on Improving the Paper-Reviewer Assignment! blog.neurips.cc/2024/12/12/neu…
I’m honored to be invited to visit MIT ORC tomorrow and give a talk, 4:15-5:15pm, on Multi-Systems and Machine Learning for Societal Challenges at MIT E51-149. Looking forward to it!
Stephanie has done amazing work in human-centered reinforcement learning and she is now on the job market!
🎇 I’m on the academic job market! I’m a PhD candidate at @mldcmu. My research tackles challenges that arise from the sequential nature of human-AI interaction. Toward this goal, my work involves: 🤖 reinforcement learning, 🧠 foundation models, and 👩‍💻 human-centered AI.…
Restless bandits have been used to model multi-armed bandit problems where arms have both states and rewards. While restless bandits have been applied to various domains, they critically rely on the assumption that rewards are separable into a sum of per-arm rewards. 1/
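To make the separability assumption concrete, here is a sketch in notation that is assumed for illustration and not taken from the thread: with N arms, s_i the state of arm i, and a_i the action taken on arm i, a standard restless bandit takes the total reward to be

```latex
R(\mathbf{s}, \mathbf{a}) \;=\; \sum_{i=1}^{N} R_i(s_i, a_i)
```

A global reward, by contrast, is a function R(s, a) of the joint state and action that need not decompose into per-arm terms in this way.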