Julia Kempe
@KempeLab
Silver Professor at NYU Courant and CDS, Research Scientist at FAIR. Research in Machine Learning; past in Quantum Computing & Finance. Posts my own.
How would you make an LLM "forget" the concept of dog — or any other arbitrary concept? 🐶❓ We introduce SAMD & SAMI — a novel, concept-agnostic approach to identify and manipulate attention modules in transformers.
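The tweet describes identifying and manipulating attention modules to ablate a concept. The paper's SAMD/SAMI procedures aren't detailed here, so this is only a minimal toy sketch of the core manipulation step, assuming a "forget" intervention amounts to rescaling (or zeroing) the output of an identified attention module; the function names and the `alpha` scaling knob are illustrative, not the paper's API.

```python
import numpy as np

def attention(Q, K, V):
    """Standard scaled dot-product attention (single head, no masking)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def scaled_attention(Q, K, V, alpha=1.0):
    """Rescale the module's contribution; alpha=0 ablates it entirely.

    Illustrative stand-in for intervening on an attention module that a
    concept-identification method has flagged as encoding the target concept.
    """
    return alpha * attention(Q, K, V)
```

With `alpha=1.0` the module behaves normally; with `alpha=0.0` its contribution to the residual stream is removed, which is the simplest form such an intervention could take.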
❓How to balance negative and positive rewards in off-policy RL❓ In Asymmetric REINFORCE for Off-Policy RL, we show that giving less weight to negative rewards is enough to stabilize off-policy RL training for LLMs! 💪 (1/8) Paper: arxiv.org/abs/2506.20520
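The stated idea, down-weighting negative rewards in an off-policy REINFORCE-style objective, can be sketched in a few lines. This is only an illustration of the asymmetry, assuming a per-sample scalar advantage and a single down-weighting factor `lam`; the actual loss, correction terms, and choice of `lam` in the paper may differ.

```python
def asymmetric_weight(advantage, lam=0.5):
    """Keep positive advantages as-is; shrink negative ones by lam in [0, 1].

    lam=1 recovers the symmetric REINFORCE weighting; lam=0 ignores
    negative-reward samples entirely. The value 0.5 here is arbitrary.
    """
    return advantage if advantage >= 0 else lam * advantage

def asymmetric_reinforce_loss(logprobs, advantages, lam=0.5):
    """REINFORCE-style surrogate loss over (log-prob, advantage) pairs."""
    return -sum(lp * asymmetric_weight(a, lam)
                for lp, a in zip(logprobs, advantages))
```

The intuition matching the tweet: off-policy, large negative-reward gradients on stale samples can destabilize training, so damping only the negative side trades a little bias for stability.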
Congrats to 37 CDS researchers — faculty, postdocs, and PhD students — who had papers accepted to ICLR 2025, including Spotlighted work by @KempeLab, @feeelix_feng, @andrewgwils, @KuangYilun, and @JianyuZhang8. Full list: nyudatascience.medium.com/cds-researcher…
Check out our poster tomorrow at 10am at the ICLR Bidirectional Human-AI Alignment workshop! We cover how on-policy preference sampling can be biased, and our optimal response sampling for human labeling. @NYUDataScience @AIatMeta @KempeLab @YaqiDuanPKU x.com/feeelix_feng/s…
You think on-policy sampling gives the best reward models? Think again! 🔥 Our finding: Even with on-policy data, reward models misalign with policy optimization goals! Introducing PILAF—strategic sampling that fixes this fundamentally. (1/11)
Here is to a next generation of AI-literate kids! International AI Olympiad ioai-official.org ML Researchers, you might appreciate the impressive syllabus. Do we have all the chops our kids are expected to have :) ? ioai-official.org/wp-content/upl…
If in Singapore next week, come by our #ICLR2025 Spotlight poster for our recent study at @KempeLab unveiling how data pruning promotes implicit bias in datasets and proposing a method (DRoP) that does exactly the opposite: iclr.cc/virtual/2025/p…
We refused to cite the paper due to severe misconduct of the authors of that paper: plagiarism of our own prior work, predominantly AI-generated content (ya, the authors plugged our paper into an LLM and generated another paper), IRB violations, etc. Revealed during a long…
Jesus Christ... openreview.net/forum?id=et5l9…
It is a real delight to work with @dohmatobelvis and I encourage every student in search of excellent and rigorous mentorship to apply to his group!
Papers accepted at @iclr_conf 2025: - An Effective Theory of Bias Amplification arxiv.org/abs/2410.17263 - Pitfalls of Memorization arxiv.org/abs/2412.07684 - Strong Model Collapse arxiv.org/abs/2410.04840 - Beyond Model Collapse arxiv.org/abs/2406.07515 With @KempeLab,…