Steffi Chern
@steffichern
Incoming CS PhD @Penn | @NSF Graduate Fellow | undergrad @CarnegieMellon 🤠
🚀How can we effectively evaluate and prevent deception by superintelligent LLMs? We introduce 🤝BeHonest, a pioneering benchmark designed to comprehensively assess honesty in LLMs. Paper 📄: [arxiv.org/abs/2406.13261] Code 👨🏻💻: [github.com/GAIR-NLP/BeHon…]…
FacTool has been accepted to COLM 2025 - two years after its arXiv debut! While the landscape of LLMs has changed a lot since then, tool-augmented LLMs and RAG are still among the most effective and practical approaches for detecting / mitigating hallucinations (ref:…
In the era of 🤖#GenerativeAI, text of all forms can be generated by LLMs. How can we identify and rectify *factual errors* in the generated output? We introduce FacTool, a framework for factuality detection in Generative AI. Website: ethanc111.github.io/factool_websit… (1/n)
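FacTool-style factuality checking follows a claim-extract-then-verify pattern with external tools. Below is a stubbed Python sketch of that general pipeline; the helpers (`extract_claims`, `search_evidence`, `verify`) are illustrative placeholders, not FacTool's actual API.

```python
# Stubbed sketch of tool-augmented factuality checking: split text into
# claims, retrieve evidence per claim, and verify each claim against it.
# All three helpers are placeholders, not FacTool's real implementation.
from dataclasses import dataclass

@dataclass
class Verdict:
    claim: str
    supported: bool
    evidence: str

def extract_claims(text: str) -> list[str]:
    """Placeholder: a real system prompts an LLM to extract atomic claims."""
    return [s.strip() for s in text.split(".") if s.strip()]

def search_evidence(claim: str) -> str:
    """Placeholder: a real system queries a search engine or knowledge base."""
    return f"retrieved passage about: {claim[:40]}"

def verify(claim: str, evidence: str) -> bool:
    """Placeholder: a real system asks an LLM whether evidence entails the claim."""
    return claim.lower()[:20] in evidence.lower()

def check_factuality(text: str) -> list[Verdict]:
    verdicts = []
    for claim in extract_claims(text):
        evidence = search_evidence(claim)
        verdicts.append(Verdict(claim, verify(claim, evidence), evidence))
    return verdicts

for v in check_factuality("The Eiffel Tower is in Paris. It was built in 1700."):
    print(v)
```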
Excited to share our new survey on the reasoning paradigm shift from "Think with Text" to "Think with Image"! 🧠🖼️ Our work offers a roadmap for more powerful & aligned AI. 🚀 📜 Paper: arxiv.org/pdf/2506.23918 ⭐ GitHub (400+🌟): github.com/zhaochen0110/A…
What Makes a Base Language Model Suitable for RL? Rumors in the community say RL (i.e., RLVR) on LLMs is full of “mysteries”: (1) Is the magic only happening on Qwen + Math? (2) Does the "aha moment" only spark during math reasoning? (3) Is evaluation hiding some tricky traps?…
New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often matching or beating human supervision. Using this approach, we train a Claude 3.5-based assistant that beats its human-supervised counterpart.
What if an LLM could update its own weights? Meet SEAL🦭: a framework where LLMs generate their own training data (self-edits) to update their weights in response to new inputs. Self-editing is learned via RL, using the updated model’s downstream performance as reward.
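The loop the tweet describes can be sketched in a few lines. Everything below (the `generate_self_edit` helper, the toy dict-valued model, the greedy accept rule) is an illustrative stand-in under assumed names, not SEAL's actual implementation.

```python
# Toy sketch of a SEAL-style loop: the model proposes its own training
# data (a "self-edit"), a fine-tuning step applies it, and the change in
# downstream performance serves as the RL reward for edit generation.
import copy
import random

def generate_self_edit(model, new_input, temperature=1.0):
    """Toy stand-in: the model proposes synthetic training data."""
    return {"text": f"note about {new_input}", "strength": random.uniform(0.0, temperature)}

def finetune(model, self_edit):
    """Toy stand-in: apply a small weight update derived from the self-edit."""
    updated = copy.deepcopy(model)
    updated["knowledge"] += self_edit["strength"]
    return updated

def downstream_score(model):
    """Toy stand-in: evaluate the updated model on a downstream task."""
    k = model["knowledge"]
    return k - 0.1 * k ** 2

model = {"knowledge": 0.0}
for step in range(10):
    edit = generate_self_edit(model, new_input=f"doc-{step}")
    candidate = finetune(model, edit)
    reward = downstream_score(candidate) - downstream_score(model)  # RL reward
    if reward > 0:  # crude policy improvement: keep edits that help downstream
        model = candidate
    print(f"step={step} reward={reward:+.3f}")
```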
To further boost the "think with images" community, we've systematically summarized the latest research in our new repository: github.com/zhaochen0110/A… 🧠🖼️Let's make LVLMs see & think! A comprehensive survey paper will be released soon! Stay tuned.
🧐When do LLMs admit their mistakes when they should know better? In our new paper, we define this behavior as retraction: the model indicates that its generated answer was wrong. LLMs can retract—but they rarely do.🤯 arxiv.org/abs/2505.16170 👇🧵
"How, exactly, could AI take over by 2027?" Introducing AI 2027: a deeply-researched scenario forecast I wrote alongside @slatestarcodex, @eli_lifland, and @thlarsen
One RL to See Them All: Visual Triple Unified Reinforcement Learning
🔥 Excited to share our work "Efficient Agent Training for Computer Use" Q: Do computer use agents need massive data or complex RL to excel? A: No, with just 312 high-quality trajectories, Qwen2.5-VL can outperform Claude 3.7, setting a new SOTA for Windows computer use. 1/6
We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
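The "refine noise step-by-step" idea is the standard diffusion sampling loop. Here is a toy numeric version for intuition only; it is a generic illustration, not Gemini Diffusion's architecture or sampler.

```python
# Miniature denoising loop: start from pure noise and repeatedly refine
# it toward a clean output. A real text diffusion model would use a
# trained network over token representations instead of this toy nudge.
import numpy as np

rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5, 3.0])   # stands in for a clean output

def denoiser(x, t):
    """Toy 'model': nudges the noisy sample toward the target."""
    return x + 0.3 * (target - x)

x = rng.normal(size=target.shape)          # start from pure noise
for t in reversed(range(20)):              # iterative step-by-step refinement
    x = denoiser(x, t)
print(np.round(x, 3))                      # close to the target after refinement
```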
Thrilled to know that our paper, `Safety Alignment Should be Made More Than Just a Few Tokens Deep`, received the ICLR 2025 Outstanding Paper Award. We sincerely thank the ICLR committee for awarding one of this year's Outstanding Paper Awards to AI Safety / Adversarial ML.…
Outstanding Papers:
- Safety Alignment Should be Made More Than Just a Few Tokens Deep. Xiangyu Qi, et al.
- Learning Dynamics of LLM Finetuning. Yi Ren and Danica J. Sutherland.
- AlphaEdit: Null-Space Constrained Model Editing for Language Models. Junfeng Fang, et al.
🔥 Happy to share our paper on test-time scaling (TTS)! 🚀 We take the position that generative AI has entered Act II: cognition engineering driven by TTS. 🛠️ We provide many valuable resources to help the community use TTS to develop models' cognitive abilities.
I finally wrote another blogpost: ysymyth.github.io/The-Second-Hal… AI just keeps getting better over time, but NOW is a special moment that I call "the halftime". Before it, training > eval. After it, eval > training. The reason: RL finally works. Lmk ur feedback so I can polish it.
🔍Excited to introduce DeepResearcher, the first end-to-end trained #DeepResearch model with #RL scaling in real-world environments! ✨No more controlled simulations - this is RL in the wild with authentic search interactions! Paper: arxiv.org/pdf/2504.03160 1/7
🥁🥁 Happy to share our latest efforts on math pre-training data: the MegaMath dataset! This 9-month project started in the summer of 2024, and we finally deliver the largest math pre-training dataset to date, containing 💥370B💥 tokens of web, code, and synthetic data!
We've written a paper (145 pages!!) about our approach for AGI safety at @GoogleDeepMind. It's not just scalable oversight and interp -- so much more needs to come together. deepmind.google/discover/blog/…
Check out our newest, fully open-source RL framework for VLMs—built from scratch, reproducible, and tested on real benchmarks!
🔥 New paper drop! 🔥 🔍 In the fast-paced world of RL scaling, where leaderboard performance and rapid results take priority, the value of transparent, step-by-step exploration is often overlooked. Our latest work, MAYE, addresses this gap by introducing: 1️⃣ A from-scratch RL…
#LIMR Less is More for RL Scaling! Less is More for RL Scaling! Less is More for RL Scaling! - What makes a good example for RL scaling? We demonstrate that a strategically selected subset of just 1,389 samples can outperform the full 8,523-sample dataset. - How to make a…
🔥 Excited to share our work "LIMR: Less is More for RL Scaling" Q: What determines the effectiveness of RL training data ? A: Alignment with model's learning journey 1,389 strategic samples ≥ 8,523 full dataset 🤯 📄: github.com/GAIR-NLP/LIMR/… 💻: github.com/GAIR-NLP/LIMR 1/6
Introducing CodeI/O (codei-o.github.io), a systematic way to condense diverse reasoning patterns via code input-output prediction to build massive training data for more reasoning tasks beyond commonly focused math problem-solving and code generation, which usually suffer…
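A minimal example of the input-output-prediction format the tweet describes: execute a snippet on a concrete input to get the ground-truth answer, then frame output prediction as a reasoning task. The field and function names below are illustrative, not CodeI/O's actual schema.

```python
# Build one training example: run the code to obtain the true output,
# then pair a predict-the-output prompt with that ground-truth answer.
import json

snippet = """
def longest_run(xs):
    best = cur = 0
    for a, b in zip(xs, xs[1:]):
        cur = cur + 1 if b == a + 1 else 0
        best = max(best, cur)
    return best + 1 if xs else 0
"""

def make_example(code: str, fn_name: str, args: tuple) -> dict:
    namespace: dict = {}
    exec(code, namespace)                     # run the snippet for ground truth
    output = namespace[fn_name](*args)
    return {
        "prompt": f"Given this code:\n{code}\nPredict the output of "
                  f"{fn_name}{args!r}, reasoning step by step.",
        "answer": repr(output),
    }

example = make_example(snippet, "longest_run", ([3, 4, 5, 9, 10],))
print(json.dumps(example, indent=2))          # answer is 3 (run: 3, 4, 5)
```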