Angelica Chen
@_angie_chen
Gemini training @ GDM. PhD from @NYUDataScience, previously @Princeton 🐅. angie-chen at 🦋. Interested in LLMs, pastries, and running.
New work w/@sadhikamalladi, @lilyhzhang, @xinyichen2, @QiuyiRichardZ, Rajesh Ranganath, @kchonyc: Contrary to conventional wisdom, RLHF/DPO does *not* produce policies that mostly assign higher likelihood to preferred responses than to less preferred ones.
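For concreteness, a minimal sketch of how one could measure this "ranking accuracy" (the fraction of preference pairs where the policy puts higher likelihood on the chosen response). The checkpoint name, data format, and tokenization handling are illustrative assumptions, not the paper's setup:

```python
# Sketch: estimate how often a DPO/RLHF-tuned policy ranks the preferred
# response above the rejected one. Checkpoint name and data format are
# placeholders; tokenization of prompt vs. prompt+response is assumed to align.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("my-dpo-tuned-model")          # hypothetical checkpoint
model = AutoModelForCausalLM.from_pretrained("my-dpo-tuned-model").eval()

@torch.no_grad()
def response_logprob(prompt: str, response: str) -> float:
    """Sum of token log-probs of `response` given `prompt` under the policy."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    ids = tok(prompt + response, return_tensors="pt").input_ids
    logits = model(ids).logits[:, :-1]                             # position t predicts token t+1
    logps = torch.log_softmax(logits, dim=-1)
    token_logps = logps.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_logps[:, prompt_len - 1:].sum().item()            # response tokens only

def ranking_accuracy(pairs) -> float:
    """`pairs` is a list of (prompt, chosen, rejected) triples."""
    wins = sum(response_logprob(p, c) > response_logprob(p, r) for p, c, r in pairs)
    return wins / len(pairs)
```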

CDS PhD student @_angie_chen presents LLOME, using LLMs to optimize synthetic sequences with potential applications for drug design. Co-led by @samuel_stanton_ & @nc_frey and with insights from @kchonyc, @RichBonneauNYC, and others at @PrescientDesign. nyudatascience.medium.com/language-model…
🌉 Bridging Offline & Online RL for LLMs 🌉
📝: arxiv.org/abs/2506.21495
New paper shows, on verifiable & non-verifiable tasks:
- Online DPO & GRPO give similar performance.
- Semi-online (iterative) DPO with sync every s steps (more efficient!) also works very well.
- Offline DPO…
Bridging Offline and Online Reinforcement Learning for LLMs
Investigates the effectiveness of RL for finetuning LLMs when transitioning from offline to semi-online to fully online regimes, for both verifiable and non-verifiable tasks.
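For intuition, here is a schematic of the offline / semi-online / online spectrum described above; `generate_pairs`, `sample_batch`, and `dpo_update` are hypothetical placeholders rather than the paper's code:

```python
# Schematic of offline vs. semi-online vs. online DPO-style training.
# s >= total_steps -> effectively offline (pairs generated once, never refreshed)
# s = 1            -> fully online (responses resampled from the current policy every step)
# 1 < s < steps    -> semi-online: re-sync the generation model every s steps
def train_policy(policy, prompts, s, total_steps):
    sampler = policy.copy()                          # model used to generate responses
    pairs = generate_pairs(sampler, prompts)         # (prompt, chosen, rejected) triples
    for step in range(total_steps):
        if step > 0 and step % s == 0:
            sampler = policy.copy()                  # periodic sync with the trained policy
            pairs = generate_pairs(sampler, prompts)
        dpo_update(policy, sample_batch(pairs))      # standard DPO loss on the sampled batch
    return policy
```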
What does it mean for #LLM output to be novel? In work w/ @jcyhc_ai, @JanePan_, @valeriechen_, @hhexiy we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵
🚨 Diverse Preference Optimization (DivPO) 🚨
SOTA LLMs suffer from model collapse 🫠: they can't generate diverse creative writing or synthetic data 🎨
DivPO trains for both high reward & diversity, vastly improving variety with similar quality.
Paper 📝: arxiv.org/abs/2501.18101
🧵below
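A rough sketch of the pair-selection idea as described in the announcement (my paraphrase, not the paper's reference implementation); `reward` and `diversity` are hypothetical scoring functions and `tau` a hypothetical reward threshold:

```python
# DivPO-style preference-pair construction for one prompt: the reward keeps
# quality high, diversity decides which responses become chosen vs. rejected.
def build_divpo_pair(responses, reward, diversity, tau):
    high = [r for r in responses if reward(r) >= tau]     # acceptable-quality pool
    low = [r for r in responses if reward(r) < tau]       # low-quality pool
    if not high or not low:
        return None                                       # no usable contrast for this prompt
    chosen = max(high, key=diversity)                     # most diverse among the good ones
    rejected = min(low, key=diversity)                    # least diverse among the bad ones
    return chosen, rejected                               # then train with a standard DPO loss
```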
I saw a slide circulating on social media last night while working on a deadline. I didn’t comment immediately because I wanted to understand the full context before speaking. After learning more, I feel compelled to address what I witnessed during an invited talk at NeurIPS 2024…
It is just so sad that the #NeurIPS2024 main conference ended with such a racist remark by a faculty member during a talk about ethics. How ironic! I also want to commend the Chinese student who spoke up right on the spot. She was respectful, decent, and courageous. Her response was…
I’ll be at NeurIPS this week! Presenting at the Thursday 4:30pm poster session and giving a spotlight talk at the AIDrugX workshop on Sunday. Also, I’ve finally joined 🦋. Come find me, both at NeurIPS and on 🦋! ☺️

Two @NeurIPSConf workshop spotlight talks from our lab this year! @amyxlu will present on all-atom protein generation from sequence-only inputs at MLSB and @_angie_chen will present on LLMs as highly-constrained biophysical sequence optimizers at AIDrugX
🚨🔔 Foundational graph search task as a testbed: given the right training distribution, transformers can learn to search (100% acc). We interpreted their algo!! But as graph size ↑, transformers struggle. Scaling up # params does not help; CoT does not help. 1.5 years of learning in 10 pages!
Check out Sadhika's talk tomorrow! She'll be talking about our paper "Preference Learning Algorithms Do Not Learn Preference Rankings" (arxiv.org/abs/2405.19534) as well as some very cool follow-up work :)
I will be giving a talk on "Failure Modes of Preference Learning" through the AI Tinkerers club on 11/26 at 12pm ET. I gave this talk at a few universities recently, and I'm excited to share it with the broader community! paperclub.aitinkerers.org/p/join-paper-c…
LLMs are clearly very general interfaces, but we weren't sure they could be made precise enough for protein design to really work. With active data collection, the right preference tuning, and test-time scaling (or just search, as we used to call it), it looks like the answer is yes!
LLMs are highly constrained biological sequence optimizers. In new work led by @_angie_chen & @samuel_stanton_ , we show how to drive an active learning loop for protein design with an LLM. 1/
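At a high level, the loop looks something like the sketch below; `llm_propose`, `evaluate`, and `preference_tune` are hypothetical placeholders standing in for the generation, scoring, and tuning steps of the actual pipeline:

```python
# High-level sketch of an LLM-driven active-learning loop for sequence design.
# All helpers are illustrative placeholders, not the paper's code.
def design_loop(llm, seed_sequences, n_rounds, n_candidates):
    labeled = [(seq, evaluate(seq)) for seq in seed_sequences]       # initial labeled pool
    for _ in range(n_rounds):
        # 1. The LLM proposes new candidate sequences, conditioned on what it has seen.
        candidates = llm_propose(llm, labeled, n_candidates)
        # 2. Score candidates with the (expensive) oracle and grow the labeled pool.
        labeled += [(seq, evaluate(seq)) for seq in candidates]
        # 3. Preference-tune the LLM so that higher-scoring sequences become more likely.
        llm = preference_tune(llm, labeled)
    return max(labeled, key=lambda pair: pair[1])                    # best sequence found
```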
What makes some LM interpretability research “mechanistic”? In our new position paper in @BlackboxNLP, @sarahwiegreffe and I argue that the practical distinction was never technical, but a historical artifact that we should be—and are—moving past to bridge communities.
Be sure to stop by Angie's oral presentation and our poster on our preference learning work (arxiv.org/abs/2405.19534) at the MHFAIA workshop at ICML! We'll also be presenting this poster at the Theoretical Foundations of Foundation Models (TF2M) workshop :)
I'll be at @icmlconf next week! Giving a plenary talk at the HiLD workshop and an oral on our recent paper (arxiv.org/abs/2405.19534) at the MHFAIA workshop! Pls reach out to chat if you're also interested in any of these topics! 😊
Self-rewarding LMs at #icml2024 ! Thru iterative DPO (w/ a small amount of seed data), LLM instruction following ↑ (AlpacaEval 2.0, human, MT-bench) & reward modeling ↑ (corr w human rankings). @jingxu_ml will be presenting in Vienna (Tues 7/23 11:30am); please stop by! (1/2)
🚨 New paper! 🚨 Self-Rewarding LMs
- The LM itself provides its own rewards on its own generations via LLM-as-a-Judge during Iterative DPO
- Reward modeling ability improves during training rather than staying fixed
...opens the door to superhuman feedback?
arxiv.org/abs/2401.10020 🧵(1/5)
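One self-rewarding iteration, as described in the thread, roughly amounts to the sketch below; `generate`, `judge_score`, and `dpo_train` are hypothetical helpers, not the paper's code:

```python
# Sketch of one self-rewarding iteration: the same model generates candidate
# responses and judges them (LLM-as-a-Judge), and the resulting preference
# pairs are used for the next round of DPO.
def self_rewarding_iteration(model, prompts, n_samples=4):
    pairs = []
    for prompt in prompts:
        candidates = [generate(model, prompt) for _ in range(n_samples)]
        scores = [judge_score(model, prompt, c) for c in candidates]   # model scores its own outputs
        chosen = candidates[scores.index(max(scores))]
        rejected = candidates[scores.index(min(scores))]
        pairs.append((prompt, chosen, rejected))
    return dpo_train(model, pairs)                                     # model for the next iteration
```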