Mir Miroyan

@mirmiroyan

cs phd @UCBerkeley sky lab | bair

Joined March 2024

115 Following

87 Followers

Mir Miroyan@mirmiroyan · Jul 21

why would we want LLMs to generate "buggy" code? if we want models to be better assistants, they need to better understand the user! if the model can act (e.g. code) like the user, we're a step closer. and what better place to explore this than in education?

Rose@rose_niousha · Jul 21

Can LLMs write code and learn like novice programmers? We release ParaStudent, a framework to study how to make LLMs generate realistic, student-like code, which is often imperfect, iterative, and stylistically diverse 👩‍🎓 Paper and code shared in the thread 👇

Mir Miroyan@mirmiroyan · May 28

incredible

lmarena.ai@lmarena_ai · May 27

The NEW LMArena is officially live! 🎉 ✨ New Logo! ⚡️ Better, faster UI/UX for chat and leaderboard 📱 Mobile optimized 💬 Chat history 🧭 Clearer leaderboard navigation 🤖 Many modalities in one place: vision, image, and more coming soon Try it now at lmarena dot ai! (Link in…

Mir Miroyan Retweeted

Melissa Pan@melissapan · Apr 25

🚨 Why Do Multi-Agent LLM Systems Fail? ⁉️ 🔥 Introducing MAST: The first multi-agent failure taxonomy - consists of 14 failure modes and 3 categories, generalizes for diverse multi-agent systems and tasks! Paper: arxiv.org/pdf/2503.13657 Code: github.com/multi-agent-sy… 🧵1/n

Mir Miroyan@mirmiroyan · Apr 17

a great improvement over the gradio UI!

lmarena.ai@lmarena_ai · Apr 17

We're excited to invite everyone to a new Beta version of LMArena! 🎉 For months, we’ve been poring through community feedback to improve the site—fixing errors/bugs, improving our UI layout, and more. To keep supporting the development and continual improvement of this…

Mir Miroyan@mirmiroyan · Apr 14

excited to release the first checkpoint of the project. it's more than just a leaderboard -- we share some interesting findings in the LMArena blog (blog.lmarena.ai/blog/2025/sear…)

lmarena.ai@lmarena_ai · Apr 14

Exciting News! Search Arena Leaderboard🌐 🥇 Gemini-2.5-Pro-Grounding and Perplexity-Sonar-Reasoning-Pro top the leaderboard! Congrats @GoogleDeepMind and @perplexity_ai! 📊 We've open-sourced 7k battles with user votes! 📝 Check out our blog post for detailed analysis. Blog…

Mir Miroyan@mirmiroyan · Mar 18

We finally have a platform to evaluate and rank how AI uses tools (in particular search) in the wild. Try asking questions that require (and don't require) search! The results are really interesting. I also want to congratulate my students @mirmiroyan and @tsunghan_wu for…

lmarena.ai@lmarena_ai · Mar 18

News: Search Arena is now LIVE! 🌐🔍 ✅ Test web-augmented LLM systems on real-time, real-world tasks — retrieval, writing, debugging & more. ✅ Perplexity, Gemini, OpenAI go head-to-head. ✅ Crowd-powered evals. Leaderboard 🏆 coming soon… ⚡Try it now at lmarena .ai!

Mir Miroyan Retweeted

lmarena.ai@lmarena_ai · Mar 18
