Shiqi Chen
@shiqi_chen17
PhD student @CityUHongKong. NLPer. Visiting PhD @NorthwesternU and @HKUST. Former @SeaAIL.
🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:…
FacTool has been accepted to COLM 2025 - two years after its arXiv debut! While the landscape of LLMs has changed a lot since then, tool-augmented LLMs and RAG are still among the most effective and practical approaches for detecting / mitigating hallucinations (ref:…
In the era of 🤖#GenerativeAI, text of all forms can be generated by LLMs. How can we identify and rectify *factual errors* in the generated output? We introduce FacTool, a framework for factuality detection in Generative AI. Website: ethanc111.github.io/factool_websit… (1/n)
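The tool-augmented loop FacTool describes (extract claims from generated text, query external tools for evidence, judge each claim) can be sketched minimally. This is an illustrative toy, not FacTool's actual API: the function bodies are stand-ins for what the real pipeline does with LLM prompting and search tools.

```python
# Illustrative sketch of a tool-augmented factuality-checking loop:
# extract claims, retrieve evidence, judge. All names are hypothetical.

def extract_claims(text):
    # Stand-in: treat each sentence as one atomic claim.
    # FacTool uses an LLM to extract fine-grained claims instead.
    return [s.strip() for s in text.split(".") if s.strip()]

def verify_claim(claim, knowledge_base):
    # Stand-in for tool-based evidence retrieval (e.g. web search)
    # followed by an LLM judgment over the retrieved evidence.
    return claim in knowledge_base

def factcheck(text, knowledge_base):
    """Map each extracted claim to a True/False factuality verdict."""
    return {c: verify_claim(c, knowledge_base) for c in extract_claims(text)}

kb = {"Paris is the capital of France"}
report = factcheck("Paris is the capital of France. The moon is cheese", kb)
print(report)
# {'Paris is the capital of France': True, 'The moon is cheese': False}
```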
SynLogic is our effort to synthesize verifiable reasoning data for RL scaling. It covers 35 diverse logical reasoning tasks and supports synthesizing at controlled difficulty and quantity. It is not only about logical reasoning. We mixed math, coding, and SynLogic altogether to…
Lack of RL Logical Reasoning data? Excited to share SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond 🚀 Building strong logical reasoning through RLVR 📄Paper: huggingface.co/papers/2505.19… 💻 Code: github.com/MiniMax-AI/Syn… (1/n)
We present Thinking with Generated Images, demonstrating how a single unified LMM can be trained to perform vision generation tasks using both textual and visual intermediate steps, along with critique and refinement capabilities. We believe that spontaneous multimodal thinking…
What if AI could mentally sketch its thoughts, just like you? The missing piece of AI multimodal reasoning is here! What if AI could daydream in images? Closing the imagination gap between humans and AI! Introducing Thinking with Generated Images — a new paradigm where large…
Check out SynLogic😀 -- a new framework and dataset for synthesizing logical reasoning data, spanning diverse logical reasoning domains. It achieves SOTA📊 on logical reasoning tasks and even shows strong generalization to other areas like math and coding💡!
[ICML 2025] Reasoning and multimodal perception are among the most exciting capabilities of modern models. Surprisingly, we found that both abilities can emerge via model merging—without any training. It’s like a free lunch 🍱 for capabilities! Even more interestingly, this…
Sharing another #ICML25 paper of ours: “Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging”! (1/5) We use model merging to enhance VLMs' reasoning by integrating math-focused LLMs—bringing textual reasoning into multi-modal models. Surprisingly, this…
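The training-free merging idea above — combining a math-focused LLM with a VLM's language backbone — can be sketched as simple per-parameter interpolation. A minimal sketch, assuming the two models share an architecture; the dict names and toy scalars here are illustrative, not the paper's code.

```python
# Minimal sketch of linear model merging over shared parameters:
# theta_merged = (1 - alpha) * theta_vlm + alpha * theta_llm

def merge_weights(vlm_weights, llm_weights, alpha=0.5):
    """Interpolate each parameter present in both state dicts."""
    return {
        name: (1 - alpha) * vlm_weights[name] + alpha * llm_weights[name]
        for name in vlm_weights
        if name in llm_weights  # merge only shared parameters
    }

# Toy example with scalar "parameters"; real weights would be tensors.
vlm = {"layer.0.w": 1.0, "vision.proj": 2.0}
llm = {"layer.0.w": 3.0}
merged = merge_weights(vlm, llm, alpha=0.5)
print(merged)  # {'layer.0.w': 2.0} — only the shared layer is merged
```

With tensors the same dict comprehension works unchanged, since `*` and `+` broadcast elementwise.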
We will be presenting the SkyLadder poster at the Open Science of Foundation Models Workshop (Hall 4 #5) at 3pm today. Please drop by and have a look!
🚀 Excited to share our new paper: SkyLadder: Better and Faster Pretraining via Context Window Scheduling! Have you ever noticed the ever-increasing ⬆ context window of pretrained language models? The first generation of GPT had a context length of 512, followed by 1024 for GPT-2,…
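The idea of a context-window schedule — start pretraining with short windows and grow toward the full window — can be sketched as a simple function of the training step. A hedged toy sketch: the linear ramp, window sizes, and warmup length here are assumptions for illustration, not SkyLadder's actual schedule.

```python
# Hypothetical linear context-window schedule for pretraining:
# grow from a short window to the full window over warmup steps.

def context_window(step, start=512, end=8192, warmup_steps=10_000):
    """Return the context length to use at a given training step."""
    if step >= warmup_steps:
        return end  # full window after warmup
    frac = step / warmup_steps
    return int(start + frac * (end - start))

print(context_window(0))       # 512
print(context_window(5_000))   # 4352 (halfway through warmup)
print(context_window(20_000))  # 8192
```

Batches would then be packed or truncated to `context_window(step)` tokens, so early training sees many short, cheap sequences.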
LLM Agents + Multi-turn RL: Why do your RL models always collapse, even in a simple env? We work in controlled envs to reveal the failure patterns and discuss how to possibly fix them. GitHub ⭐️1.4k, easy to add your own env, we are actively resolving issues and welcome…
Why does your RL training always collapse? In our new paper of RAGEN, we explore what breaks when you train LLM *Agents* with multi-turn reinforcement learning—and possibly how to fix it. 📄 github.com/RAGEN-AI/RAGEN… 🌐 ragen-ai.github.io 1/🧵👇
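The multi-turn setting RAGEN studies — an agent taking several actions in an environment before any reward arrives — can be sketched as a minimal rollout loop. A toy sketch only: the environment, policy, and function names below are illustrative and not the RAGEN API.

```python
# Minimal sketch of a multi-turn agent rollout in a toy environment.

def rollout(policy, env_step, max_turns=5):
    """Collect one trajectory as a list of (state, action, reward) turns."""
    state, trajectory = 0, []
    for _ in range(max_turns):
        action = policy(state)
        state, reward, done = env_step(state, action)
        trajectory.append((state, action, reward))
        if done:
            break
    return trajectory

# Toy env: sparse reward only when the agent reaches state 3.
def env_step(state, action):
    new_state = state + action
    done = new_state == 3
    return new_state, float(done), done

traj = rollout(lambda s: 1, env_step)
print(len(traj))  # 3 turns to reach the goal; reward arrives only at the end
```

The sparse, end-of-trajectory reward is exactly what makes multi-turn RL credit assignment hard and training prone to collapse.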
Excited to share our work at ICLR 2025 in 🇸🇬. @iclr_conf 🥳 Happy to chat about LLM reasoning & planning, agents, and AI4Science! 📍Sat 26 Apr, 3:00–5:30 p.m. CST, Hall 3 + Hall 2B #554