Yeda Song
@__runamu__
Multimodal Agents for the Real World: GUI Agents, VLM, and RL @ UMich 🇺🇸
🔥 GUI agents struggle with real-world mobile tasks. We present MONDAY—a diverse, large-scale dataset built via an automatic pipeline that transforms internet videos into GUI agent data. ✅ VLMs trained on MONDAY show strong generalization ✅ Open data (313K steps) (1/7) 🧵 #CVPR

Flow Q-learning (FQL) is a simple method to train/fine-tune an expressive flow policy with RL. Come visit our poster at 4:30p-7p this Wed (evening session, 2nd day)!
Excited to introduce flow Q-learning (FQL)! Flow Q-learning is a *simple* and scalable data-driven RL method that trains an expressive policy with flow matching. Paper: arxiv.org/abs/2502.02538 Project page: seohong.me/projects/fql/ Thread ↓
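For readers wondering what "trains an expressive policy with flow matching" looks like mechanically, here is a minimal, unofficial Python sketch of the flow-matching behavioral-cloning piece (a state-conditioned velocity field trained to carry Gaussian noise to dataset actions, sampled with a few Euler steps). The dimensions, architecture, and hyperparameters are made up for illustration, and the way FQL actually couples this with the Q-learning objective is specified in the paper and project page above, not here.

```python
# Unofficial sketch of the flow-matching part of an expressive policy.
# Everything below (dimensions, architecture, hyperparameters) is illustrative,
# not the FQL reference implementation.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 17, 6  # hypothetical environment dimensions


class VelocityNet(nn.Module):
    """Predicts the velocity field v(s, x_t, t) that transports noise to actions."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM + 1, 256), nn.ReLU(),
            nn.Linear(256, ACTION_DIM),
        )

    def forward(self, s, x_t, t):
        return self.net(torch.cat([s, x_t, t], dim=-1))


def flow_matching_loss(vel_net, s, a_data):
    """Conditional flow matching: regress toward the straight-line velocity."""
    noise = torch.randn_like(a_data)        # x_0 ~ N(0, I)
    t = torch.rand(a_data.shape[0], 1)      # t ~ U(0, 1)
    x_t = (1 - t) * noise + t * a_data      # point on the linear path
    target_v = a_data - noise               # velocity of that path
    return ((vel_net(s, x_t, t) - target_v) ** 2).mean()


@torch.no_grad()
def sample_action(vel_net, s, steps=10):
    """Integrate the learned field from noise to an action with Euler steps."""
    x = torch.randn(s.shape[0], ACTION_DIM)
    for k in range(steps):
        t = torch.full((s.shape[0], 1), k / steps)
        x = x + vel_net(s, x, t) / steps
    return x


# Toy usage on random tensors standing in for an offline RL batch.
vel_net = VelocityNet()
opt = torch.optim.Adam(vel_net.parameters(), lr=3e-4)
s, a = torch.randn(256, STATE_DIM), torch.randn(256, ACTION_DIM)
loss = flow_matching_loss(vel_net, s, a)
opt.zero_grad(); loss.backward(); opt.step()
print(loss.item(), sample_action(vel_net, s[:4]).shape)
```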
✨Two life updates✨ 1. Started my internship at @LG_AI_Research in Ann Arbor, Michigan — Advancing AI for a better life! 🔮 2. Advanced to PhD candidacy at UMich CSE. This means I’ve completed my coursework and passed the qualification process. 🙌
The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystallizing: - Natively multimodal…
I’m so excited to announce Gemma 3n is here! 🎉 🔊Multimodal (text/audio/image/video) understanding 🤯Runs with as little as 2GB of RAM 🏆First model under 10B with @lmarena_ai score of 1300+ Available now on @huggingface, @kaggle, llama.cpp, ai.dev, and more
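If you want to poke at Gemma 3n yourself, below is a hedged loading sketch via the Hugging Face transformers pipeline. The model id, pipeline task, and message format are assumptions from memory of the release; check the official model card for the exact repository name, license acceptance step, and minimum transformers version.

```python
# Unofficial sketch: trying Gemma 3n through Hugging Face transformers.
# The model id and pipeline task below are assumptions; verify them on the
# model card before running (a recent transformers release is required).
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",            # Gemma 3n is a multimodal model
    model="google/gemma-3n-E2B-it",  # assumed id of the smaller instruction-tuned variant
    device_map="auto",               # place weights on whatever hardware is available
)

messages = [
    {"role": "user",
     "content": [{"type": "text", "text": "In one sentence, what is Gemma 3n?"}]},
]
print(pipe(text=messages, max_new_tokens=64)[0]["generated_text"])
```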
Can scaling data and models alone solve computer vision? 🤔 Join us at the SP4V Workshop at #ICCV2025 in Hawaii to explore this question! 🎤 Speakers: @danfei_xu, @joaocarreira, @jiajunwu_cs, Kristen Grauman, @sainingxie, @vincesitzmann 🔗 sp4v.github.io
We're heading to #CVPR2025! 📰Curious about what’s coming? Take a look at our list of accepted papers and come to meet the authors! Get ready for innovative #AI research and fresh insights!
Excited to speak at the Workshop on Computer Vision in the Wild @CVPR 2025! 🎥🌍 🗓️ June 11 | 📍 Room 101 B, Music City Center, Nashville, TN 🎸 🧠 Talk: From Perception to Action: Building World Models for Generalist Agents Let’s connect if you're around! #CVPR2025 #robotics…
🚀 Excited to announce our 4th Workshop on Computer Vision in the Wild (CVinW) at @CVPR 2025! 🔗 computer-vision-in-the-wild.github.io/cvpr-2025/ ⭐We have invited a great lineup of speakers: Prof. Kaiming He, Prof. @BoqingGo, Prof. @CordeliaSchmid, Prof. @RanjayKrishna, Prof. @sainingxie, Prof.…
Arrived in Nashville for #CVPR 🤠 Excited to present MONDAY, a collaboration with @LG_AI_Research! 📍 MMFM Workshop - Thu, 9:40 AM 📍 Main Conference - Fri, 4:00 PM Let’s connect and chat!🤝 Also exploring Summer 2026 internships 🔍 MONDAY website: monday-dataset.github.io
I finally wrote another blogpost: ysymyth.github.io/The-Second-Hal… AI just keeps getting better over time, but NOW is a special moment that I call “the halftime”. Before it, training > eval. After it, eval > training. The reason: RL finally works. Lmk ur feedback so I’ll polish it.
LLM chatbots are moving fast, but how do we make them better? In my new blog at The Gradient, I argue that an important next step is giving them a sense of "purpose."
I love our Michigan AI Lab @michigan_AI! A group of people who not only do some of the coolest research in AI, but also care for each other and enjoy each other’s company. A picture from this week’s fun picnic. ❤️
Glad to share our work at #ACL2023, "MPChat: Towards Multimodal Persona-Grounded Conversation" arxiv.org/abs/2305.17388 ! #multimodal #persona_chat authors: @AHNJAEWOO2, @__runamu__, Gunhee Kim