Yujia Qin
@TsingYoga
ByteDancer, Agent, THU (16-20 BS in EE, 20-24 PhD in CS)
Introducing UI-TARS-1.5, a vision-language model that beats OpenAI Operator and Claude 3.7 on GUI Agent and Game Agent tasks. We've open-sourced a small-sized version of the model for research purposes; more details are in our blog. TARS learns solely from a screen, but…
China's Gaokao is the biggest exam in the world: 13M test takers and 9 hours. Only ~0.02% make it to the top university, Tsinghua. As of this week, AI models can make it too. 625/750 is the top 1 percentile; the highest human scores are ~720-740. Gemini 2.5 Pro gets 655, barely making the cut for Tsinghua!
Meet Agent TARS Beta, based on Seed1.5-VL
Since we have released a brand new Agent TARS CLI based on Seed1.5-VL (see agent-tars.com/beta), we have to say goodbye to the old Agent TARS Desktop: github.com/bytedance/UI-T…
Introducing Agent TARS Beta — a brand new and more powerful Agent TARS! - Agent TARS CLI - Browser Agent driven by Seed-1.5-VL - Native Streaming - Multimodal-friendly Web UI - Layered Agent architecture Blog: agent-tars.com/beta Quick start: agent-tars.com/quick-start
🚀 UI-TARS Desktop v0.2.1 is now live! Free Remote Computer & Browser Operator are ready to roll—no setup, just click and go🎁! Get started: 🔽Download: github.com/bytedance/UI-T… 🔽Quick Start: github.com/bytedance/UI-T…
One way of thinking about what AI will automate first is via the “description-execution gap”: how much harder is it to describe the task than to actually do it? Tasks with large description-execution gaps will be ripe for automation because it’s easy to create training data and…
Had a great time at this CVPR community-building workshop: lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…
In this #CVPR2025 edition of our community-building workshop series, we focus on supporting the growth of early-career researchers. Join us tomorrow (Jun 11) at 12:45 PM in Room 209 Schedule: sites.google.com/view/standoutc… We have an exciting lineup of invited talks and candid…
I’d argue that computer use, in principle, is much harder than math/coding for current AI. The digital world encompasses a much larger part of the complexity of this world. The goals are often vastly underspecified and require accessing and understanding broad context (in users’…
I guess it's the first open-source multi-turn end-to-end RL for GUI agents from academia, and it's based on UI-TARS-1.5-7B. If you want to study multimodal agent RL, it's a good starting point~ arxiv.org/abs/2505.16282

Interesting to learn that the previous Operator was based on 4o, not even o1... OpenAI is shifting from reasoning models (o3) to agent models (Operator, Codex, and Deep Research), with gradual integration of agent data streams from multiple teams; this is evident in GAIA’s jump from 12.3 to 62.2…
Operator 🤝 OpenAI o3 Operator in ChatGPT has been updated with our latest reasoning model. operator.chatgpt.com
OK, ByteDance Seed is now firmly a top-tier lab in my mind. Congrats on many solid works recently, continuously publishing and even releasing models. A shame that I'm really, really bad at remembering individual Chinese names though :-/