Xinyu Yang
@Xinyu2ML
Ph.D. @CarnegieMellon. Working on data and hardware-driven principled algorithm & system co-design for scalable and generalizable foundation models. They/Them
🚀 Super excited to share Multiverse! 🏃 It’s been a long journey exploring the space between model design and hardware efficiency. What excites me most is realizing that, beyond optimizing existing models, we can discover better model architectures by embracing system-level…
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: multiverse4fm.github.io 🧵 1/n
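For intuition, here is a minimal, unofficial sketch of what adaptive parallel generation can look like: the model first proposes independent subgoals, decodes them concurrently, then merges the branches. The `generate` helper and the map/process/reduce prompts below are illustrative stand-ins, not the released Multiverse interface.

```python
# Illustrative sketch only: a toy "map -> parallel process -> reduce" decoding loop.
# `generate` stands in for any LLM call; it is NOT the released Multiverse API.
from concurrent.futures import ThreadPoolExecutor

def generate(prompt: str) -> str:
    # Hypothetical stand-in for a model call (e.g. a local or hosted LLM).
    return f"[model output for: {prompt[:40]}...]"

def multiverse_style_decode(question: str) -> str:
    # Map: let the model propose independent subgoals it can solve in parallel.
    plan = generate(f"Split this problem into independent subgoals, one per line:\n{question}")
    subgoals = [line.strip() for line in plan.splitlines() if line.strip()]

    # Process: decode each branch concurrently instead of one long sequential chain.
    with ThreadPoolExecutor(max_workers=len(subgoals) or 1) as pool:
        branches = list(pool.map(lambda g: generate(f"Solve this subgoal:\n{g}"), subgoals))

    # Reduce: merge the parallel branches back into a single final answer.
    merged = "\n".join(branches)
    return generate(f"Combine these partial results into one answer:\n{merged}\n\nQuestion: {question}")

if __name__ == "__main__":
    print(multiverse_style_decode("Count the integers between 1 and 100 divisible by 3 or 5."))
```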
Agentic foundation models have emerged as a promising direction towards AGI. However, we are still in the early stages of developing such models capable of multi-modal reasoning—an essential capability for enabling most real-world applications. We warmly invite you to submit your…
🚨 We’re thrilled to announce our ICCV 2025 Workshop: MMRAgI – Multi-Modal Reasoning for Agentic Intelligence! 🚨 🌐 Homepage: agent-intelligence.github.io/agent-intellig… 📥 Submit: openreview.net/group?id=thecv… 🗓️ Submission Deadline (Proceeding Track): June 24th 2025 23:59 AoE 🗓️ Submission Deadline…
A recurrent depth/Huginn-3.5B Update: I originally wanted to post these more often, but I guess time is a river, and I just don't like posting all that much yet... The most interesting finding about the depth recurrent model has been this unassuming chart, actually:
I’m gonna be recruiting students thru both @LTIatCMU (NLP) and @CMU_EPP (Engineering and Public Policy) for fall 2026! If you are interested in reasoning, memorization, AI for science & discovery and of course privacy, u can catch me at ACL! Prospective students fill this form:
📣Thrilled to announce I’ll join Carnegie Mellon University (@CMU_EPP & @LTIatCMU) as an Assistant Professor starting Fall 2026! Until then, I’ll be a Research Scientist at @AIatMeta FAIR in SF, working with @kamalikac’s amazing team on privacy, security, and reasoning in LLMs!
Mixture‑of‑Experts (MoE) powers many frontier models like R1, K2, & Qwen3 ⚡️ To make frontier-scale MoE models accessible to train, we open-source MoMoE, a hyper-performant MoE implementation built for training and inference, outpacing the fastest existing ones by up to: - 70%…
I used to underestimate the importance of prompt engineering. However, after working on Multiverse, I’ve come to realize that the success of LLMs in solving highly challenging tasks is deeply tied to prompt design. For example, generating each training example for Multiverse…
Code release! 🚀 Following up on our IMO 2025 results with the public LLM Gemini 2.5 Pro — here’s the full pipeline & general (non-problem-specific) prompts. 👉 [github.com/lyang36/IMO25] Have fun exploring! #AI #Math #LLMs #IMO2025
🚨 Olympiad math + AI: We ran Google’s Gemini 2.5 Pro on the fresh IMO 2025 problems. With careful prompting and pipeline design, it solved 5 out of 6 — remarkable for tasks demanding deep insight and creativity. The model could win gold! 🥇 #AI #Math #LLMs #IMO2025
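A hedged sketch of the kind of pipeline these posts describe: a propose-verify-refine loop around a single model. The real prompts and control flow live in github.com/lyang36/IMO25; `call_model` and the prompt wording below are hypothetical placeholders.

```python
# Illustrative only: a generic propose -> verify -> refine loop of the kind the
# posts describe. The actual pipeline and prompts are in github.com/lyang36/IMO25;
# `call_model` and the prompt text here are hypothetical stand-ins.
def call_model(prompt: str) -> str:
    return "[model response]"  # replace with a real Gemini / LLM API call

def solve_with_verification(problem: str, max_rounds: int = 5) -> str:
    solution = call_model(f"Write a complete, rigorous solution.\n\nProblem:\n{problem}")
    for _ in range(max_rounds):
        critique = call_model(
            "Act as a strict grader. List every gap or error, or reply 'ACCEPT'.\n\n"
            f"Problem:\n{problem}\n\nCandidate solution:\n{solution}"
        )
        if "ACCEPT" in critique:
            break  # the verifier found no remaining issues
        solution = call_model(
            f"Revise the solution to fix these issues:\n{critique}\n\n"
            f"Problem:\n{problem}\n\nPrevious attempt:\n{solution}"
        )
    return solution
```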
Maybe a better talk on Multiverse haha (after 10+ talks)
recording link: hku.zoom.us/rec/share/TyPL… pwd: T4Y1Z.99
🚨 The era of infinite internet data is ending, so we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
1/3✅Our LCFM at #ICML2025 workshop wrapped up successfully! 👏Huge thanks to our speakers for sharing cutting-edge insights: @tri_dao @PangWeiKoh @bmwshop @jiajunwu_cs @volokuleshov 👏And to our panelists for the inspiring discussion: @YuandongTian @MohitIyyer @bmwshop @Xinyu2ML
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
🚀 Introducing Prefix-RFT to blend SFT and RFT! SFT can learn more complex problems by mimicking, but can have poor generalization. RFT has better overall performance but is limited by the initial policy. Our method, Prefix-RFT, makes the best of both worlds!
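As a rough illustration of the blending idea (not the paper's exact algorithm): keep a prefix of an expert demonstration fixed, let the current policy roll out the continuation, and reward that continuation. The names `policy_rollout` and `reward_fn` are hypothetical hooks.

```python
# Hedged sketch of blending SFT and RFT via demonstration prefixes: the fixed prefix
# supplies SFT-like guidance, the on-policy continuation is trained with a reward.
# `policy_rollout` and `reward_fn` are hypothetical hooks, not the paper's code.
import random

def prefix_rft_sample(prompt: str, demonstration: str, policy_rollout, reward_fn):
    # Pick a random cut point: the demonstration prefix anchors the rollout,
    # and the rest is generated on-policy.
    tokens = demonstration.split()
    cut = random.randint(0, len(tokens))
    prefix = " ".join(tokens[:cut])

    completion = policy_rollout(prompt + "\n" + prefix)      # on-policy continuation
    reward = reward_fn(prompt, prefix + " " + completion)    # e.g. answer correctness
    return prefix, completion, reward                        # feed into any RL update (PPO, GRPO, ...)

# Example with dummy hooks:
prefix, completion, reward = prefix_rft_sample(
    "Solve 2+2.", "First add the numbers. The answer is 4.",
    policy_rollout=lambda p: "The answer is 4.",
    reward_fn=lambda prompt, solution: float("4" in solution),
)
```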
Join us next week! We are presenting Multiverse at the great HKUNLP seminar!
Xinyu Yang from CMU will be giving a talk titled "Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation" on Friday, July 25 at 11am HKT (Thursday, July 24 at 8pm PDT). Link to talk: hku.zoom.us/j/92651812689?…
Looking forward to seeing everyone for ES-FoMo part three tomorrow! We'll be in East Exhibition Hall A (the big one), and we've got an exciting schedule of invited talks, orals, and posters planned for you tomorrow. Let's meet some of our great speakers! 1/
We’re open-sourcing the pre-training code for Phi4-mini-Flash, our SoTA hybrid model that delivers 10× faster reasoning than Transformers — along with μP++, a suite of simple yet powerful scaling laws for stable large-scale training. 🔗 github.com/microsoft/Arch… (1/4)
Things are moving fast with our team (all good news!) We’re baking something really exciting. Sadly missing @icmlconf this year in person, but I’ll be giving a virtual oral at the R2-FM Workshop (Reliable & Responsible Foundation Models): 📍 Sat, 10:40–11:00 AM PT 📍 West…
Benchmarks say “perfect score.” 😇 A model scoring that high can still lose $100,000 on a single decision. 😈 Our position paper argues: safety must headline evaluation for LLM finance agents. [📖arxiv.org/abs/2502.15865] We outline a 3-layer audit recipe (model, workflow &…
Shouldn't that be located in China, or at least Asia, given that the majority of attendees with visa issues are from China or other Asian countries?
We're excited to announce a second physical location for NeurIPS 2025, in Mexico City. By expanding our physical locations, we hope to address concerns around skyrocketing attendance and difficulties in obtaining travel visas that some attendees have experienced in the past few…
📢 Update: Announcing Dream's next-phase development. - Dream-Coder 7B: A fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. - DreamOn: Targeting the variable-length generation problem in dLLMs!
Autoregressive models are too restrictive by forcing a fixed generation order, while masked diffusion is wasteful as it fits all possible orders. Can our model dynamically decide the next position to generate based on context? Learn more in our ICML paper arxiv.org/abs/2503.05979
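A toy sketch of the "choose the next position dynamically" idea, under the assumption that the model scores every still-empty position and fills the most confident one first; the actual model and training objective are in the linked paper, and `predict` here is a random stand-in.

```python
# Toy illustration (not the paper's model): fill positions in a confidence-driven
# order instead of left-to-right or a random masked-diffusion order.
import numpy as np

rng = np.random.default_rng(0)
seq_len, vocab = 8, 20
canvas = [None] * seq_len  # the partially generated sequence

def predict(canvas):
    # Hypothetical stand-in for a model that, given the current partial sequence,
    # returns a token distribution for every position.
    return rng.dirichlet(np.ones(vocab), size=seq_len)

while any(tok is None for tok in canvas):
    probs = predict(canvas)
    empty = [i for i, tok in enumerate(canvas) if tok is None]
    # Dynamically choose the next position: the one the model is most confident about.
    i = max(empty, key=lambda j: probs[j].max())
    canvas[i] = int(probs[i].argmax())

print(canvas)  # positions were filled in a context-dependent order
```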
Asynchronous decoding: multiple LLM threads write different parts of an answer in parallel. In Feb we (MIT×Google) introduced PASTA—the first asynchronous-decoding method that uses policy learning to optimize latency & quality end-to-end. See us @ E-2600, East Hall A-B, Tue 11pm #ICML.
A new approach from CSAIL & Google marks a shift toward teaching models to orchestrate their own parallel decoding strategy. The team's "Parallel Structure Annotation" (PASTA) enables LLMs to generate text in parallel, accelerating their response times: bit.ly/4eDsVVo
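A toy sketch of asynchronous decoding in the spirit of these posts (not PASTA's implementation): the answer is split into independent sections that separate tasks expand concurrently, so wall-clock latency tracks the slowest section rather than their sum. `expand` is a hypothetical model call.

```python
# Toy sketch of asynchronous decoding (not PASTA's actual implementation): the
# response is an outline of independent sections, and separate tasks expand each
# section concurrently. `expand` is a hypothetical model call.
import asyncio

async def expand(section: str) -> str:
    await asyncio.sleep(0.1)          # stands in for a model call's latency
    return f"<text for '{section}'>"  # hypothetical generated span

async def decode_async(outline: list[str]) -> str:
    # Launch one decoding task per independent section; they run concurrently,
    # so total latency is roughly the slowest section, not the sum of all of them.
    parts = await asyncio.gather(*(expand(sec) for sec in outline))
    return "\n".join(parts)

print(asyncio.run(decode_async(["intro", "method", "results"])))
```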