Yan Ma
@ManTle_Ma
CS Ph.D. student @FudanUniversity
Huge thanks again to AK for sharing our work! 🙏 We're thrilled to announce that our code, data, and models are all open-sourced! 🚀 Dive in and try them out: Code: github.com/MiniMax-AI/One… 💻 Data: huggingface.co/datasets/One-R… 📊 Models: huggingface.co/One-RL-to-See-… 🤖 Feel free to…
One RL to See Them All: Visual Triple Unified Reinforcement Learning
🚨 New release: MegaScience The largest & highest-quality post-training dataset for scientific reasoning is now open-sourced (1.25M QA pairs)! 📈 Trained models outperform official Instruct baselines 🔬 Covers 7+ disciplines with university-level textbook-grade QA 📄 Paper:…
FacTool has been accepted to COLM 2025 - two years after its arXiv debut! While the landscape of LLMs has changed a lot since then, tool-augmented LLMs and RAG are still among the most effective and practical approaches for detecting / mitigating hallucinations (ref:…
In the era of 🤖#GenerativeAI, text of all forms can be generated by LLMs. How can we identify and rectify *factual errors* in the generated output? We introduce FacTool, a framework for factuality detection in Generative AI. Website: ethanc111.github.io/factool_websit… (1/n)
Check out our new survey on the next paradigm for multimodal chain-of-thought reasoning.🔥🔥
Excited to share our new survey on the reasoning paradigm shift from "Think with Text" to "Think with Image"! 🧠🖼️ Our work offers a roadmap for more powerful & aligned AI. 🚀 📜 Paper: arxiv.org/pdf/2506.23918 ⭐ GitHub (400+🌟): github.com/zhaochen0110/A…
What Makes a Base Language Model Suitable for RL? Rumors in the community say RL (i.e., RLVR) on LLMs is full of “mysteries”: (1) Is the magic only happening on Qwen + Math? (2) Does the "aha moment" only spark during math reasoning? (3) Is evaluation hiding some tricky traps?…
I had previously misconfigured the 3D RoPE for Qwen-VL within Verl. Correcting this issue led to a notable improvement in training stability. It appears that position IDs play a significant role in the stability of RL post-training for VLMs.
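For context, here is a toy sketch of the idea behind 3D (M-RoPE-style) position IDs for a vision-language model: text tokens advance all three axes together like 1D RoPE, while image patches keep the temporal axis fixed and index height/width from the patch grid. This is an illustrative simplification I wrote, not the actual Verl or Qwen-VL code; the function name and the [text][image][text] layout are my assumptions.

```python
def mrope_position_ids(text_before, grid_h, grid_w, text_after):
    """Toy (temporal, height, width) position ids for a
    [text][image][text] token sequence, M-RoPE style."""
    pos = []
    # Leading text: all three axes advance together (behaves like 1D RoPE).
    for i in range(text_before):
        pos.append((i, i, i))
    base = text_before
    # Image patches: temporal axis stays constant for the whole image;
    # height/width axes follow the patch-grid coordinates.
    for h in range(grid_h):
        for w in range(grid_w):
            pos.append((base, base + h, base + w))
    # Text after the image resumes from the largest id used so far + 1.
    nxt = base + max(grid_h, grid_w)
    for i in range(text_after):
        pos.append((nxt + i, nxt + i, nxt + i))
    return pos

# Example: 2 text tokens, a 2x2 image, 1 trailing text token.
ids = mrope_position_ids(2, 2, 2, 1)
```

Getting these triples wrong (e.g. letting image tokens inflate the temporal axis) silently shifts attention patterns, which is one plausible reason a misconfiguration shows up as RL training instability rather than an outright crash.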

It's an honor to be part of this great work.
Day 1/5 of #MiniMaxWeek: We’re open-sourcing MiniMax-M1, our latest LLM — setting new standards in long-context reasoning. - World’s longest context window: 1M-token input, 80k-token output - State-of-the-art agentic use among open-source models - RL at unmatched efficiency:…
To further boost the "think with images" community, we've systematically summarized the latest research in our new repository: github.com/zhaochen0110/A… 🧠🖼️Let's make LVLMs see & think! A comprehensive survey paper will be released soon! Stay tuned.
🥲 Just survived the COLM rebuttal. I asked Gemini one tough question, and it replied: NIONIONIONIONIONIONIONIONIONIONION... Again. And again. And again. Looks like I broke it. 🧠💥 → 🧅🧅🧅🧅 #COLM2025 #RebuttalSeason #LLMmeltdown #GeminiBreakdown #PromptTooStrong

🧐When do LLMs admit their mistakes when they should know better? In our new paper, we define this behavior as retraction: the model indicates that its generated answer was wrong. LLMs can retract—but they rarely do.🤯 arxiv.org/abs/2505.16170 👇🧵
What if AI could mentally sketch its thoughts, just like you do? What if it could daydream in images? The missing piece of multimodal AI reasoning is here, closing the imagination gap between humans and AI! Introducing Thinking with Generated Images — a new paradigm where large…
Lack of RL Logical Reasoning data? Excited to share SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond 🚀 Building strong logical reasoning through RLVR 📄Paper: huggingface.co/papers/2505.19… 💻 Code: github.com/MiniMax-AI/Syn… (1/n)
🔥 Excited to share our work "Efficient Agent Training for Computer Use" Q: Do computer use agents need massive data or complex RL to excel? A: No, with just 312 high-quality trajectories, Qwen2.5-VL can outperform Claude 3.7, setting a new SOTA for Windows computer use. 1/6
Finally broke 100 citations! 🎉 I might be the slowest one in our group @ GAIR Lab to hit 100 citations, though 😅 #Research #AcademicGrowth #Milestone

🔥See here!
I really love alphaXiv since it turns arXiv into a community in such a simple way @askalphaxiv. BTW, happy to see Cognition Engineering reach 700+ likes within one week. We have also released an updated version based on community feedback.