Sihao Chen
@soshsihao
Researcher @ Microsoft #OAR. Learning AI models from experience. Previously: @upennnlp @cogcomp @GoogleAI. Opinions my own.
Life update: I defended my Ph.D. thesis and have joined @Microsoft's Office of Applied Research (OAR)! One big takeaway from my Ph.D. study is that research is all about translating ideas into impact. I feel blessed to work with talented researchers who share the same values!


🤔 We know what people are using LLMs for, but do we know how they collaborate with an LLM? 🔍 In a recent paper we answered this by analyzing multi-turn sessions across 21 million interaction logs from Microsoft Copilot for consumers and WildChat: arxiv.org/abs/2505.16023
We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between? Introducing Log-Linear Attention with: - Log-linear time training - Log-time inference (in both time and memory) - Hardware-efficient Triton kernels
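A toy sketch of the two endpoints being interpolated (not the paper's Triton kernels): full softmax attention with quadratic cost vs. a linear-attention-style recurrence that keeps a fixed-size state. Log-linear attention sits between these extremes; the code below only shows the endpoints.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Causal softmax attention: O(T^2) time, state grows with sequence length.
    T = Q.shape[0]
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # Linear-attention / SSM-style recurrence: O(T) time, constant-size state S.
    d = Q.shape[-1]
    S = np.zeros((d, d))   # accumulated key-value memory
    z = np.zeros(d)        # normalizer
    out = []
    for q, k, v in zip(Q, K, V):
        S += np.outer(phi(k), v)
        z += phi(k)
        out.append(phi(q) @ S / (phi(q) @ z + 1e-6))
    return np.array(out)

T, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, T, d))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```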
Huge congrats @sharma_ashish_2!!👏👏
🎓 Congrats to Ashish Sharma, @UW on receiving the ACM Doctoral Dissertation Award for his dissertation, "Human-AI Collaboration to Support Mental Health and Well Being." 👏 Honorable Mentions: Alexander Kelley, @UofIllinois Sewon Min, @UCBerkeley
Missing nuance in the collective realization today: The non-trivial negative result is not that "RL just amplifies skills that are already there with low probability". Duh, that's obvious and not an issue actually. What got questioned today is that "dumb pretraining teaches the…
🔥🔥Let's start cooking 😎😎
Excited to share that I’ll be interning @Microsoft Office of Applied Research this summer, working on reinforcement finetuning with the awesome @soshsihao and @ylongqi. Seattle friends, let’s catch up and chat about anything from alignment to inference-time scaling!
Join us if you want to work on next-gen collaborative, socially intelligent agents!
🙌We are looking for a full-time research scientist in Microsoft's Office of Applied Research! This person will help lead the development of next-generation AI technologies to support groups of people - not just individuals - in getting their work done.
LLMs naturally memorize some of their pre-training data verbatim. We study whether post-training can be an effective way to mitigate unintentional reproduction of pre-training data. 🛠️ No changes to pre-training or decoding 🔥 Training models to latently distinguish between memorized…
This is so true. LLM researchers seem to like to "specialize" in either pretraining or post training. Doing intense research on both sides does unlock something.
No LLM researcher should spend their whole life on one side of the pre/post training divide. The former teaches you what is actually happening, the latter reminds you what actually matters.
Want to 𝐜𝐮𝐭 𝐑𝐅𝐓 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐭𝐢𝐦𝐞 𝐛𝐲 𝐮𝐩 𝐭𝐨 𝟐× and boost performance? 🚀 Meet 𝑨𝒅𝒂𝑹𝑭𝑻 — a lightweight, plug-and-play curriculum learning method you can drop into any mainstream RFT algorithm (PPO, GRPO, REINFORCE). Less compute. Better results. 🧵 1/n
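A rough sketch of the adaptive-curriculum idea, not AdaRFT's exact algorithm: train on problems near a target difficulty and shift that target based on the policy's recent success rate. All names and constants below (target_success, step_size, the fake training step) are illustrative.

```python
import random

def pick_batch(problems, target_difficulty, batch_size):
    # problems: list of (problem_id, difficulty in [0, 1]); take those closest to the target.
    return sorted(problems, key=lambda p: abs(p[1] - target_difficulty))[:batch_size]

def adaptive_curriculum(problems, train_step, num_rounds=100,
                        batch_size=16, target_success=0.5, step_size=0.05):
    target_difficulty = 0.2  # start easy
    for _ in range(num_rounds):
        batch = pick_batch(problems, target_difficulty, batch_size)
        success_rate = train_step(batch)  # one RFT update (PPO/GRPO/...), returns mean reward
        # If the policy beats the target success rate, serve harder problems next round.
        target_difficulty += step_size * (success_rate - target_success)
        target_difficulty = min(max(target_difficulty, 0.0), 1.0)
    return target_difficulty

# Toy usage with a fake policy whose skill grows over time.
problems = [(i, random.random()) for i in range(1000)]
skill = [0.3]
def fake_train_step(batch):
    skill[0] = min(1.0, skill[0] + 0.01)
    return sum(1.0 for _, d in batch if d < skill[0]) / len(batch)

print(adaptive_curriculum(problems, fake_train_step))
```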
🚀 How well can LLMs know you and personalize their responses? Turns out, not so much! Introducing the PersonaMem Benchmark -- 👩🏻💻Evaluates LLMs' ability to understand an evolving persona from 180+ multi-session user-chatbot conversation histories 🎯Latest models (GPT-4.1, GPT-4.5,…
Will be at #NAACL this week. Let's talk if you are interested in RL, agents, and LLM post training in general!
Heading to #NAACL2025 w/ @peizNLP @soshsihao We are hiring full-time scientists on LLM post training, long-context reasoning, agents, and reinforcement finetuning. Please reach out if you are interested in chatting at the conference!
#ICLR2025 Oral LLMs often struggle with reliable and consistent decisions under uncertainty 😵💫 — largely because they can't reliably estimate the probability of each choice. We propose BIRD 🐦, a framework that significantly enhances LLM decision making under uncertainty. BIRD…
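For contrast, a naive baseline (not BIRD itself): estimating per-choice probabilities by repeated sampling, which is the kind of noisy, expensive estimate the tweet points at. The sampler below is a toy stand-in for a stochastic LLM call.

```python
from collections import Counter
import random

def estimate_choice_probs(sample_answer, choices, num_samples=50):
    # sample_answer() stands in for one stochastic LLM call returning a choice.
    counts = Counter(sample_answer() for _ in range(num_samples))
    return {c: counts[c] / num_samples for c in choices}

# Toy stand-in for an LLM that picks choice "A" 70% of the time.
choices = ["A", "B"]
probs = estimate_choice_probs(lambda: random.choices(choices, weights=[0.7, 0.3])[0], choices)
print(probs)
```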
📢 𝐖𝐢𝐥𝐝𝐅𝐞𝐞𝐝𝐛𝐚𝐜𝐤 A large-scale preference dataset built from 𝐫𝐞𝐚𝐥 𝐮𝐬𝐞𝐫 interactions with ChatGPT ✅ 𝟐𝟎𝐤+ preference pairs 🗣️ Built from 𝟏𝐌 chats 🔍 Annotated with 𝐝𝐢𝐚𝐥𝐨𝐠𝐮𝐞 𝐬𝐭𝐚𝐭𝐞, 𝐝𝐨𝐦𝐚𝐢𝐧, 𝐢𝐧𝐭𝐞𝐧𝐭, and more huggingface.co/datasets/micro…
🤖 Tired of slow tree searches on LLMs? 🚀 Check out our latest research on efficient tree search! 🔹 We introduce an upgraded transformer architecture that enables token-level self-reward modeling (TRM). 🔹 On top of that, we developed the Streaming Looking Ahead (SLA)…
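A generic beam-style tree search sketch (not the paper's TRM or SLA): expand partial sequences step by step and keep the continuations a reward function scores highest. Both the candidate generator and the reward below are toy stand-ins.

```python
import heapq

def tree_search(step_candidates, reward, beam_width=3, depth=5, prefix=()):
    # step_candidates(prefix) -> iterable of next tokens; reward(seq) -> float.
    beam = [prefix]
    for _ in range(depth):
        expanded = [seq + (tok,) for seq in beam for tok in step_candidates(seq)]
        if not expanded:
            break
        beam = heapq.nlargest(beam_width, expanded, key=reward)
    return max(beam, key=reward)

# Toy usage: the reward prefers sequences that alternate tokens.
vocab = ("a", "b")
best = tree_search(
    step_candidates=lambda seq: vocab,
    reward=lambda seq: sum(1 for x, y in zip(seq, seq[1:]) if x != y),
)
print(best)
```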