Wayne Chi
@iamwaynechi
CS Ph.D. at @SCSatCMU. Funded by @NDSEG Fellowship. Editor at http://blog.ml.cmu.edu. @MSFTResearch over the summer
What do developers 𝘳𝘦𝘢𝘭𝘭𝘺 think of AI coding assistants? In October, we launched @CopilotArena to collect user preferences on real dev workflows. After months of live service, we’re here to share our findings in our recent preprint. Here's what we have learned /🧵
Introducing Copilot Arena - Interactive coding evaluation in the wild. Our extension lets you test top models for free, right in VSCode. Let's vote and build the Copilot leaderboard! Download here: marketplace.visualstudio.com/items?itemName… Led by @iamwaynechi and @valeriechen_ at CMU. 1/🧵
We’re featured in the new tech report on Mercury models! Check it out👇
Since our launch earlier this year, we are thrilled to witness the growing community around dLLMs. The Mercury tech report from @InceptionAILabs is now on @arxiv with more extensive evaluations: arxiv.org/abs/2506.17298 New model updates dropping later this week!
Come check us out tomorrow at 9:55am at our first workshop oral! #ICML2025
Heading to Vancouver for ICML ✈️🇨🇦 Let’s chat about coding agents, evals, and human-AI collab. I’ll also be on the job market this upcoming cycle, looking for TT faculty roles + post-docs. Here's where you'll be able to find me this week👇
Excited to see everyone @icmlconf! Please drop by our poster session for @CopilotArena on Tuesday!
Mark is also cracked so...
You don’t need a PhD to be a great AI researcher. Even @OpenAI’s Chief Research Officer doesn’t have a PhD.
Super cool insights on AI usage!
We surveyed hundreds of engineers building in AI about everything from which models they’re using to whether they’re using a dedicated vector database. And, of course, if they think everyone will have AI girlfriends by 2030. Some highlights 🧵:
Excited to announce 🎵Magenta RealTime, the first open weights music generation model capable of real-time audio generation with real-time control. 👋 **Try Magenta RT on Colab TPUs**: colab.research.google.com/github/magenta… 👀 Blog post: g.co/magenta/rt 🧵 below
You're absolutely right You're absolutely right You're absolutely right
New result: Qwen-2.5-Coder jumps from 13th to joint 1st place with fill-in-the-middle (FiM)! Congrats to @Alibaba_Qwen 🥳 Also check out @lmarena_ai 's new UI 🖥️✨
blog.ml.cmu.edu/2025/06/01/rlh… In this in-depth coding tutorial, @GaoZhaolin and @g_k_swamy walk through the steps to train an LLM via RL from Human Feedback!
Crazy that it's been almost a decade since my last internship... Super excited to be at @MSFTResearch this summer! Will hopefully build an awesome new agentic system with @bansalg_ and @HsseinMzannar

Who is winning the race to claim the LLMs for SWE market? We share our thoughts based on our @CopilotArena work. See article below for current sentiments and what lies ahead 👇
OpenAI is making a big push into one of the most popular AI domains: software engineering on.wsj.com/3SCvoW2