Yan Ma
@ManTle_Ma
CS Ph.D. student @FudanUniversity
Huge thanks again to AK for sharing our work! 🙏 We're thrilled to announce that our code, data, and models are all open-sourced! 🚀 Dive in and try them out: Code: github.com/MiniMax-AI/One… 💻 Data: huggingface.co/datasets/One-R… 📊 Models: huggingface.co/One-RL-to-See-… 🤖 Feel free to…
One RL to See Them All: Visual Triple Unified Reinforcement Learning
🚨 New release: MegaScience The largest & highest-quality post-training dataset for scientific reasoning is now open-sourced (1.25M QA pairs)! 📈 Trained models outperform official Instruct baselines 🔬 Covers 7+ disciplines with university-level textbook-grade QA 📄 Paper:…
FacTool has been accepted to COLM 2025 - two years after its arXiv debut! While the landscape of LLMs has changed a lot since then, tool-augmented LLMs and RAG are still among the most effective and practical approaches for detecting / mitigating hallucinations (ref:…
In the era of 🤖#GenerativeAI, text of all forms can be generated by LLMs. How can we identify and rectify *factual errors* in the generated output? We introduce FacTool, a framework for factuality detection in Generative AI. Website: ethanc111.github.io/factool_websit… (1/n)
Check out our new survey on the next paradigm for multimodal chain-of-thought reasoning.🔥🔥
Excited to share our new survey on the reasoning paradigm shift from "Think with Text" to "Think with Image"! 🧠🖼️ Our work offers a roadmap for more powerful & aligned AI. 🚀 📜 Paper: arxiv.org/pdf/2506.23918 ⭐ GitHub (400+🌟): github.com/zhaochen0110/A…
What Makes a Base Language Model Suitable for RL? Rumors in the community say RL (i.e., RLVR) on LLMs is full of “mysteries”: (1) Is the magic only happening on Qwen + Math? (2) Does the "aha moment" only spark during math reasoning? (3) Is evaluation hiding some tricky traps?…
I had previously misconfigured the 3D RoPE for Qwen-VL within Verl. Correcting this issue led to a notable improvement in training stability. It appears that position IDs play a significant role in the stability of RL post-training for VLMs.
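For context, here is a toy sketch of the idea behind 3D (M-RoPE-style) position IDs for a vision-language model: text tokens advance all three axes together like 1D RoPE, while image patches keep the temporal axis fixed and index height/width from the patch grid. This is an illustrative simplification I wrote, not the actual Verl or Qwen-VL code; the function name and the [text][image][text] layout are my assumptions.

```python
def mrope_position_ids(text_before, grid_h, grid_w, text_after):
    """Toy (temporal, height, width) position ids for a
    [text][image][text] token sequence, M-RoPE style."""
    pos = []
    # Leading text: all three axes advance together (behaves like 1D RoPE).
    for i in range(text_before):
        pos.append((i, i, i))
    base = text_before
    # Image patches: temporal axis stays constant for the whole image;
    # height/width axes follow the patch-grid coordinates.
    for h in range(grid_h):
        for w in range(grid_w):
            pos.append((base, base + h, base + w))
    # Text after the image resumes from the largest id used so far + 1.
    nxt = base + max(grid_h, grid_w)
    for i in range(text_after):
        pos.append((nxt + i, nxt + i, nxt + i))
    return pos

# Example: 2 text tokens, a 2x2 image, 1 trailing text token.
ids = mrope_position_ids(2, 2, 2, 1)
```

Getting these triples wrong (e.g. letting image tokens inflate the temporal axis) silently shifts attention patterns, which is one plausible reason a misconfiguration shows up as RL training instability rather than an outright crash.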

It's an honor to be part of this great work.
Day 1/5 of #MiniMaxWeek: We’re open-sourcing MiniMax-M1, our latest LLM — setting new standards in long-context reasoning. - World’s longest context window: 1M-token input, 80k-token output - State-of-the-art agentic use among open-source models - RL at unmatched efficiency:…
To further boost the "think with images" community, we've systematically summarized the latest research in our new repository: github.com/zhaochen0110/A… 🧠🖼️Let's make LVLMs see & think! A comprehensive survey paper will be released soon! Stay tuned.
🥲 Just survived the COLM rebuttal. I asked Gemini one tough question, and it replied: NIONIONIONIONIONIONIONIONIONIONION... Again. And again. And again. Looks like I broke it. 🧠💥 → 🧅🧅🧅🧅 #COLM2025 #RebuttalSeason #LLMmeltdown #GeminiBreakdown #PromptTooStrong

🧐When do LLMs admit their mistakes when they should know better? In our new paper, we define this behavior as retraction: the model indicates that its generated answer was wrong. LLMs can retract—but they rarely do.🤯 arxiv.org/abs/2505.16170 👇🧵
What if AI could mentally sketch its thoughts, just like you do? What if it could daydream in images? The missing piece of multimodal AI reasoning is here, closing the imagination gap between humans and AI! Introducing Thinking with Generated Images — a new paradigm where large…
Lack of RL Logical Reasoning data? Excited to share SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond 🚀 Building strong logical reasoning through RLVR 📄Paper: huggingface.co/papers/2505.19… 💻 Code: github.com/MiniMax-AI/Syn… (1/n)
🔥 Excited to share our work "Efficient Agent Training for Computer Use" Q: Do computer use agents need massive data or complex RL to excel? A: No, with just 312 high-quality trajectories, Qwen2.5-VL can outperform Claude 3.7, setting a new SOTA for Windows computer use. 1/6
Finally broke 100 citations! 🎉 I might be the slowest one in our group @ GAIR Lab to hit 100 citations, though 😅 #Research #AcademicGrowth #Milestone

🔥See here!
I really love alphaXiv since it turns arXiv into a community in such a simple way @askalphaxiv. BTW, happy to see Cognition Engineering reach 700+ likes within one week. We have also released an updated version based on community feedback.