Chenxin An
@AnChancy46881
PhD Candidate @ HKUNLP · Awardee of the Hong Kong PhD Fellowship Scheme
# 🚨 4B open-recipe model beats Claude-4-Opus 🔓 100% open data, recipe, model weights and code. Introducing Polaris✨--a post-training recipe for scaling RL on advanced reasoning models. 🥳 Check out how we boost open-recipe reasoning models to incredible performance levels…

amazing
🚀The era of overpriced, black-box coding assistants is OVER. Thrilled to lead the @Agentica_ team in open-sourcing and training DeepSWE—a SOTA software engineering agent trained end-to-end with @deepseek_ai-like RL on Qwen32B, hitting 59% on SWE-Bench-Verified and topping the…
Excited to share that Describe Anything has been accepted at ICCV 2025! 🎉 Describe Anything Model (DAM) is a powerful Multimodal LLM that generates detailed descriptions for user-specified regions in images or videos using points, boxes, scribbles, or masks. Open-source code,…
Nvidia just dropped Describe Anything on Hugging Face: Detailed Localized Image and Video Captioning
What happened after Dream 7B? First, Dream-Coder 7B: a fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. Plus, DreamOn cracks the variable-length generation problem! It enables code infilling that goes beyond a fixed canvas.
We present DreamOn: a simple yet effective method for variable-length generation in diffusion language models. Our approach boosts code infilling performance significantly and even catches up with oracle results.
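The tweet doesn't spell out the mechanism, so here is a minimal toy sketch of the variable-length idea as I read it: during iterative denoising, a mask slot can resolve to a real token, split into two masks, or be deleted, letting the infill region grow or shrink beyond a fixed canvas. The EXPAND/DELETE control tokens and the stub predictor are illustrative assumptions, not DreamOn's actual interface.

```python
# Toy sketch of variable-length infilling in a masked diffusion LM.
# The EXPAND/DELETE control tokens and stub predictor are illustrative
# assumptions -- NOT DreamOn's actual implementation.
import random

MASK, EXPAND, DELETE = "<mask>", "<expand>", "<delete>"

def stub_predict(canvas, i):
    """Stand-in for the model: propose a (token, confidence) for the
    mask at position i. A real dLLM would score the whole canvas."""
    return random.choice([("pass", 0.9), (EXPAND, 0.6), (DELETE, 0.5)])

def infill(prefix, suffix, n_masks=4, steps=16):
    canvas = [MASK] * n_masks
    for _ in range(steps):
        masks = [i for i, t in enumerate(canvas) if t == MASK]
        if not masks:
            break
        # Resolve the single most confident mask this step (MaskGIT-style).
        i, (tok, _) = max(
            ((j, stub_predict(canvas, j)) for j in masks),
            key=lambda x: x[1][1],
        )
        if tok == EXPAND:            # grow: one mask becomes two
            canvas[i:i + 1] = [MASK, MASK]
        elif tok == DELETE:          # shrink: drop this slot entirely
            canvas.pop(i)
        else:                        # commit a concrete token
            canvas[i] = tok
    return prefix + canvas + suffix

print(infill(["def", "f", "(", ")", ":"], ["return", "x"]))
```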
📢 Update: Announcing Dream's next-phase development. - Dream-Coder 7B: A fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. - DreamOn: targeting the variable-length generation problem in dLLM!
Check out Dream-Coder 🔥
🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
Check out Reka Flash 3.1. It's finally open-source!🥳
Very excited to lead the continued pre-training (Math, Coding & Long-Context) and long-reasoning cold start & RL of this model. Proud moment seeing it go open source!🚀(1/2)
Reasoning can be made much, much faster—with fundamental changes in neural architecture. 😮 Introducing Phi4-mini-Flash-Reasoning: a 3.8B model that surpasses Phi4-mini-Reasoning on major reasoning tasks (AIME24/25, MATH500, GPQA-D), while delivering up to 10× higher throughput…
Why does your RL training always collapse? In our new paper on RAGEN, we explore what breaks when you train LLM *Agents* with multi-turn reinforcement learning—and possibly how to fix it. 📄 github.com/RAGEN-AI/RAGEN… 🌐 ragen-ai.github.io 1/🧵👇
🎙️ Welcome to try MOSS-TTSD~ When we first heard our AI voices naturally chatting and even interrupting each other, the shock was indescribable. This isn't cold TTS anymore - it's dialogue with real warmth. Try it online! huggingface.co/spaces/fnlp/MO…
Polaris results are quite impressive! I converted the 4B to MLX. I'm not a fan of quantization on small models, but I created 4-, 5-, 6-, and 8-bit quants plus bf16. Enjoy!
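For anyone reproducing the conversion: mlx-lm (`pip install mlx-lm`) exposes a `convert` function that downloads a Hugging Face checkpoint and writes MLX weights, with optional quantization. A minimal sketch, assuming the Polaris repo id below (swap in the actual path) and mlx-lm's current keyword names:

```python
# Sketch of the MLX conversion with mlx-lm. The Hugging Face repo id is
# an assumed placeholder; keyword names follow mlx_lm.convert in recent
# mlx-lm releases -- check your installed version.
from mlx_lm import convert

HF_REPO = "POLARIS-Project/Polaris-4B-Preview"  # assumed repo id

# One quantized variant per bit width the tweet mentions...
for bits in (4, 5, 6, 8):
    convert(
        hf_path=HF_REPO,
        mlx_path=f"polaris-4b-mlx-{bits}bit",
        quantize=True,
        q_bits=bits,
    )

# ...plus an unquantized bf16 copy.
convert(hf_path=HF_REPO, mlx_path="polaris-4b-mlx-bf16", dtype="bfloat16")
```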
Excited to introduce our 7B Coding Diffusion LLM, DiffuCoder — advancing open-source diffusion models for high-quality code generation! Diffusion offers powerful global planning via iterative generation, and code is the perfect testbed to push its limits! #DiffusionModels
🤖Can diffusion models write code competitively? Excited to share our latest 7B coding diffusion LLM!!💻 With DiffuCoder, we explore how they decode, why temperature🔥 matters, and how to improve them via coupled-GRPO that speaks diffusion!!📈 Code: github.com/apple/ml-diffu… 🧵
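For readers new to dLLM decoding, here is a minimal confidence-based unmasking loop (MaskGIT/LLaDA-style) that shows where temperature enters: all masks are predicted in parallel each step, temperature reshapes the per-token confidences, and only the most confident fills are committed. `model` is a stand-in returning per-position logits; this is a generic sketch, not DiffuCoder's actual decoder.

```python
# Minimal confidence-based unmasking loop for a masked diffusion LM
# (MaskGIT/LLaDA-style). `model` is any stand-in mapping a [seq] id
# tensor to [seq, vocab] logits; NOT DiffuCoder's actual decoder.
import torch

def diffusion_decode(model, ids, mask_id, steps=8, temperature=1.0):
    ids = ids.clone()
    for step in range(steps):
        masked = (ids == mask_id).nonzero(as_tuple=True)[0]
        if masked.numel() == 0:
            break
        logits = model(ids)[masked] / temperature   # temperature scaling
        probs = torch.softmax(logits, dim=-1)
        samples = torch.multinomial(probs, 1).squeeze(-1)
        conf = probs.gather(-1, samples.unsqueeze(-1)).squeeze(-1)
        # Commit only the most confident fills this step; the rest stay
        # masked and are re-predicted with more context next step.
        k = max(1, masked.numel() // (steps - step))
        keep = conf.topk(k).indices
        ids[masked[keep]] = samples[keep]
    return ids
```

Lower temperature sharpens the confidence ranking (more deterministic commits); higher temperature spreads probability mass, which changes both what gets sampled and the order in which positions are unmasked.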
🚀 New work: OpenMOSS Embodied Planner-R1 - A step toward AI self-improvement in interactive planning! We've developed an RL framework where LLMs learn to plan through autonomous environmental exploration - no human demonstrations needed. 🤖 🧵 Thread below 👇
🚀 New milestone in ultra-long text generation! LongWriter-Zero uses pure RL (no SFT, no synthetic data) to produce ultra-long, coherent texts (10k+ words). Beats open-source models like DeepSeek-R1 and Qwen3-235B in many domains. 👉 huggingface.co/THU-KEG/LongWr…
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
Apple introduces DiffuCoder, a 7B diffusion LLM trained on 130B tokens of code. The authors also propose a diffusion-native RL training framework, coupled-GRPO. Decoding of dLLMs differs from…
We release the fully async RL system AReaL-boba² for LLMs & SOTA code RL with Qwen3-14B! @Alibaba_Qwen #opensource
🚀 system & algorithm co-design → 2.77× faster
✅ 69.1 on LiveCodeBench
🔥 multi-turn RL ready
🔗 Project: github.com/inclusionAI/AR…
📄 Paper: arxiv.org/pdf/2505.24298
1/3👇
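The "fully async" claim is the interesting systems part: rollout generation and training are decoupled so neither blocks the other. A generic producer/consumer skeleton of that pattern (illustrative only, not AReaL's actual architecture):

```python
# Generic async-RL skeleton: rollout workers keep generating while the
# trainer consumes batches, so neither side idles waiting on the other.
# Illustrates the decoupling pattern only, not AReaL's actual code.
import queue, threading, time, random

rollouts: queue.Queue = queue.Queue(maxsize=64)  # bounds rollout staleness

def rollout_worker(wid: int):
    while True:
        time.sleep(random.uniform(0.01, 0.05))   # stand-in for generation
        rollouts.put({"worker": wid, "reward": random.random()})

def trainer(num_steps: int, batch_size: int = 8):
    for step in range(num_steps):
        batch = [rollouts.get() for _ in range(batch_size)]
        avg = sum(r["reward"] for r in batch) / batch_size
        print(f"step {step}: avg reward {avg:.3f}")  # stand-in for update

for wid in range(4):
    threading.Thread(target=rollout_worker, args=(wid,), daemon=True).start()
trainer(num_steps=5)
```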
Some nice ablations confirming lots of what we know. A recurring theme I keep explaining these days is how much more technically complex the current RL regime is than previous post-training approaches (at least in open research).