So Yeon (Tiffany) Min
@SoYeonTiffMin
Member of Technical Staff @AnthropicAI Prev: @Apple, @Meta, PhD @mldcmu, B.S. and M.Eng from @MITEECS.
I am on the industry job market, and am planning to interview around next March. I am attending @NeurIPSConf, and I hope to meet you there if you are hiring! My website: soyeonm.github.io Short bio about me: I am a 5th year PhD student at CMU MLD, working with @rsalakhu…
This project was an incredible collaboration between conservation ecologists at the MPG Ranch and our lab at CMU, focused on studying and eradicating the invasive plant "Leafy Spurge" using AI. cs.cmu.edu/news/2025/gen-… As part of this work, we are also releasing a unique…
Proud and happy to see OpenAgentSafety coming out! Further pushing the frontier of evaluating interactional safety risks in human-AI agent collaboration. Kudos to @sanidhya903 and @Aditya_Soni_8, who led the project!
1/ AI agents are increasingly being deployed for real-world tasks, but how safe are they in high-stakes settings? 🚨 NEW: OpenAgentSafety - A comprehensive framework for evaluating AI agent safety in realistic scenarios across eight critical risk categories. 🧵
This is my lecture from 2 months ago at @Cornell “How do I increase my output?” One natural answer is "I will just work a few more hours." Working longer can help, but eventually you hit a physical limit. A better question is, “How do I increase my output without increasing…
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
Grok 4 is here! Join us in building the world's largest and most advanced infrastructure for training and serving the world's best model. Our supercomputing team is hiring! job-boards.greenhouse.io/xai/jobs/47048…
When we make rapid progress, we tend to double down on the working paradigm. The bitter lesson suggests that this can be risky for deep learning. When trying to make a new paradigm work at all, it is often necessary to add structures, e.g. clever modeling ideas. These structures…
I really like this diagram from @_jasonwei and @hwchung27 about how to view the bitter lesson: it's a mistake not to add structure now, and a mistake not to remove that structure later. We're at the precipice of setting up a huge, powerful RL training run that will define the…
Very excited to share that HAICosystem has been accepted to #COLM2025 ! 🎉 Multi-turn, interactive evaluation is THE future: think Tau-Bench, TheAgentCompany, Sotopia, ... Proud to take a small step toward open-ended, interactive AI safety eval, and excited for what’s next! 😎
1/ What if you could see how your AI handles the chaos of the real world? Meet HAICOSYSTEM: the framework to simulate human-AI-environment interactions—all at once. 🌍🤖 Find out if your AI is truly safe under pressure from real-world scenarios! 🔥 🌐: haicosystem.org
What we're seeing in AI will also happen in other technical fields. While AI’s expected impact is undeniably large, that's not unique; there are other hugely valuable areas, e.g. robotics, longevity. What truly differentiates AI is its rate of progress, specifically how it's…
We have long been accustomed to planning life around a 30-year career. As our healthspan increases, that assumption is increasingly wrong. What would you do differently if your career were, say, 100 years instead of 30? Which options have you subconsciously given up because you…
Excited to see more investigation into agentic misalignment! We have some pioneering work on this topic as well: 🔬 HAICOSYSTEM - Framework for testing AI agent safety in complex human-AI interactions arxiv.org/abs/2409.16427 🎯 AI-LieDar - Studies the utility and truthfulness…
New Anthropic Research: Agentic Misalignment. In stress-testing experiments designed to identify risks before they cause real harm, we find that AI models from multiple providers attempt to blackmail a (fictional) user to avoid being shut down.
Presenting FACTR today at #RSS2025 in the Imitation Learning I session at 5:30pm (June 22). Come by if you're interested in force-feedback teleop and policy learning!
Low-cost teleop systems have democratized robot data collection, but they lack any force feedback, making it challenging to teleoperate contact-rich tasks. Many robot arms provide force information — a critical yet underutilized modality in robot learning. We introduce: 1. 🦾A…
Come interact with @allhands_ai !!! And let us know if you'd want your own summary/user report 😁
For fun, we asked OpenHands to take a look at some of my GitHub commit and interaction history and figure out what kind of developer I am. I need to add this to my resume next time I apply for a SWE role 😆 HT @nlpxuhui
Interested in how generative AI can be used for human-robot interaction? We’re organizing the 2nd Workshop on Generative AI for Human-Robot Interaction (GenAI-HRI) at #RSS2025 in LA — bringing together the world's leading experts in the field. The workshop is happening on Wed,…
Congratulations to LTI Assistant Prof @ybisk on his Amazon Research Award! lti.cmu.edu/news-and-event…
Holy cow! It has been over 10 years - no way! Feels like I was giving this tutorial just a few years ago.
Check out my (somewhat) recent KDD tutorial on deep nets, RBMs, DBNs, DBMs, and multimodal deep learning: videolectures.net/kdd2014_salakh…
Heading to #CVPR2025 to present our Oral paper with @NVIDIARobotics! 📅 June 14 (Sat) | 🕐 1:00 PM | 📍Oral Session 4B @ ExHall A2 I’ll also be at the 3D-VLA/VLM and EVAL-FoMo 2 workshops presenting the same work. Come say hi!
🔥 VLMs aren’t built for spatial reasoning — yet. They hallucinate free space. Misjudge object fit. Can’t tell below from behind. We built RoboSpatial to tackle that — a dataset for teaching spatial understanding to 2D/3D VLMs for robotics. 📝 Perfect review scores @CVPR 2025
Future AI systems interacting with humans will need to perform social reasoning that is grounded in behavioral cues and external knowledge. We introduce Social Genome to study and advance this form of reasoning in models! New paper w/ Marian Qian, @pliang279, & @lpmorency!
Lots of interest in AI reasoning, but most use cases involve structured inputs (text) with automatic and objective verifiers (e.g. coding, math). @lmathur_'s latest work takes an ambitious step towards social reasoning in AI, a task where inputs are highly multimodal (verbal and…
Differences in model quality are magnified with task difficulty. So if you work on harder problems, you benefit more from AI progress. Good forcing function to work on more challenging problems!
RL with verifiable reward has shown impressive results in improving LLM reasoning, but what can we do when we do not have ground truth answers? Introducing Self-Rewarding Training (SRT): where language models provide their own reward for RL training! 🧵 1/n