Yanheng He (@YanhengHe)

Pinned

Yanheng He@YanhengHe · May 22

🔥 Excited to share our work "Efficient Agent Training for Computer Use" Q: Do computer use agents need massive data or complex RL to excel? A: No, with just 312 high-quality trajectories, Qwen2.5-VL can outperform Claude 3.7, setting a new SOTA for Windows computer use. 1/6

Yanheng He Retweeted

Alexander Wei@alexwei_ · Jul 19

1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

Yanheng He@YanhengHe · Jul 17

Amazing step toward digital superintelligence!

OpenAI@OpenAI · Jul 17

ChatGPT agent is ready to introduce itself. openai.com/live

Yanheng He Retweeted

Saining Xie@sainingxie · Jul 7

Thanks for bringing this to my attention. I honestly wasn’t aware of the situation until the recent posts started going viral. I would never encourage my students to do anything like this—if I were serving as an Area Chair, any paper with this kind of prompt would be…

Yanheng He@YanhengHe · Jun 27

The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing: - Natively multimodal…

Omar Sanseviero@osanseviero · Jun 26

I’m so excited to announce Gemma 3n is here! 🎉 🔊Multimodal (text/audio/image/video) understanding 🤯Runs with as little as 2GB of RAM 🏆First model under 10B with @lmarena_ai score of 1300+ Available now on @huggingface, @kaggle, llama.cpp, ai.dev, and more

Yanheng He@YanhengHe · Jun 12

Had a great time at this CVPR community-building workshop---lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…

Anand Bhattad@anand_bhattad · Jun 10

In this #CVPR2025 edition of our community-building workshop series, we focus on supporting the growth of early-career researchers. Join us tomorrow (Jun 11) at 12:45 PM in Room 209 Schedule: sites.google.com/view/standoutc… We have an exciting lineup of invited talks and candid…

Yanheng He@YanhengHe · May 22

📣 New Discovery on Computer Use Agent With just 312 high-quality trajectories + open-source model, we've surpassed Claude 3.7 Sonnet (thinking) in computer use capabilities 🚀 ⚡️ In the new era of AI Agent training, many key questions remain: • Can open-source models + small…

Jiahe Jin@jiahe_Jin0123 · May 22

Excited to share PC Agent-E, our new work on efficient agent training for computer use! Trained with only❗️312 human trajectories enhanced by Claude 3.7 Sonnet, PC Agent-E achieves a 🤯 141% relative improvement, even surpasses Claude 3.7 Sonnet (thinking)!

Yanheng He@YanhengHe · May 22

312 quality trajectories + open-source model beats Claude 3.7 Sonnet (thinking) in computer use 🚀 We answer the following important questions in our recent tech report: github.com/GAIR-NLP/PC-Ag… 1. Can open-source models + small high-quality datasets outperform top closed-source…

Yanheng He@YanhengHe · May 22

🔥 Excited to share our work "Efficient Agent Training for Computer Use" Q: Do computer use agents need massive data or complex RL to excel? A: No, with just 312 high-quality trajectories, Qwen2.5-VL can outperform Claude 3.7, setting a new SOTA for Windows computer use. 1/6

Yanheng He Retweeted

Jiahe Jin@jiahe_Jin0123 · May 22

Excited to share PC Agent-E, our new work on efficient agent training for computer use! Trained with only❗️312 human trajectories enhanced by Claude 3.7 Sonnet, PC Agent-E achieves a 🤯 141% relative improvement, even surpasses Claude 3.7 Sonnet (thinking)!

Yanheng He Retweeted

Haoyang Zou@alanyoung8848 · Feb 17

🔥 Excited to share our work "LIMR: Less is More for RL Scaling" Q: What determines the effectiveness of RL training data? A: Alignment with the model's learning journey. 1,389 strategic samples ≥ 8,523 full dataset 🤯 📄: github.com/GAIR-NLP/LIMR/… 💻: github.com/GAIR-NLP/LIMR 1/6

Yanheng He Retweeted

Jiahe Jin@jiahe_Jin0123 · Feb 14

🤔 What makes 3D LLMs truly unique compared to 2D VLMs? 🕵️‍♂️ We uncovered the "2D-Cheating" problem in 3D LLM evaluation: Many tasks can be easily solved by 2D VLMs using rendered images, failing to test true 3D capabilities!

Yanheng He Retweeted

Jiahe Jin@jiahe_Jin0123 · Dec 24

🤔 Struggling to train capable AI agents due to lack of quality data? 🚀 Meet PC Tracker & PC Agent - our groundbreaking system that learns from real human computer operation process to handle complex digital work! Watch how PC Agent automatically creates slides about Attention…
