Dr. Karen Ullrich
@karen_ullrich
Research scientist at FAIR NY + collab w/ Vector Institute. ❤️ Machine Learning + Information Theory. Previously, PhD at UoAmsterdam, intern at DeepMind + MSRC.
#Tokenization is undeniably a key player in the success story of #LLMs, but we poorly understand why. I want to highlight the progress we have made in understanding the role of tokenization, identifying its core issues, and mitigating its problems. 🧵👇
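One core issue is tokenization bias: a greedy, longest-match tokenizer never produces certain token sequences, so the LM effectively assigns them zero training probability. Below is a toy sketch with a made-up vocabulary (not any real tokenizer) that shows the effect.

```python
# Toy illustration of tokenization bias (hypothetical vocabulary,
# not a real BPE tokenizer): a greedy longest-match tokenizer never
# emits some token sequences, so the LM never sees them in training.
vocab = ["hello", "hel", "lo", "h", "e", "l", "o"]

def greedy_tokenize(text, vocab):
    """Tokenize by repeatedly taking the longest vocab entry that matches."""
    toks = []
    i = 0
    while i < len(text):
        match = max((t for t in vocab if text.startswith(t, i)), key=len)
        toks.append(match)
        i += len(match)
    return toks

print(greedy_tokenize("hello", vocab))  # always ["hello"], never ["hel", "lo"]
```

So if a user prompt ends mid-word and tokenizes to ["hel"], the model is conditioned on a token sequence it essentially never saw during training: that mismatch is the bias.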
Our work got accepted to #ICLR2025 @iclr_conf! Learn more about tokenization bias and how to convert your tokenized LLM to byte-level LLM without training! See you in Singapore! Check out the code here: github.com/facebookresear…
🎉Our paper just got accepted to #ICLR2025! 🎉 Byte-level LLMs without training and guaranteed performance? Curious how? Dive into our work! 📚✨ Paper: arxiv.org/abs/2410.09303 Github: github.com/facebookrese...
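The idea of recovering byte-level behavior from a tokenized LM can be sketched by marginalizing the token distribution into a next-byte distribution. This toy example uses a made-up vocabulary and probabilities and is only illustrative; it is not the paper's actual conversion algorithm.

```python
# Toy sketch: turn a token-level next-token distribution into a
# next-byte distribution by summing over tokens sharing a first byte.
# Vocabulary and probabilities are hypothetical, for illustration only.

toy_vocab = {0: "he", 1: "hello", 2: "hat", 3: "a"}

# Hypothetical next-token distribution from some tokenized LM.
p_token = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

def next_byte_probs(p_token, vocab):
    """Marginalize token probabilities into first-byte probabilities."""
    p_byte = {}
    for tok_id, p in p_token.items():
        first = vocab[tok_id][0]  # first character/byte of this token
        p_byte[first] = p_byte.get(first, 0.0) + p
    return p_byte

probs = next_byte_probs(p_token, toy_vocab)
# "he", "hello", "hat" all contribute to 'h'; only "a" contributes to 'a'.
```

A full conversion also has to handle bytes beyond the first and tokens already partially consumed by the prompt, which is where the guarantees in the paper come in.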
📢 My team at Meta is hiring visiting PhD students from CMU, UW, Berkeley, and NYU! We study core ML, optimization, amortization, transport, flows, and control for modeling and interacting with complex systems. Please apply here and message me: metacareers.com/jobs/395022364…
Excited to release EvalGIM, an easy-to-use evaluation library for generative image models. EvalGIM ("EvalGym") unifies metrics, datasets, & visualizations, is customizable & extensible to new benchmarks, & provides actionable insights. Check it out! github.com/facebookresear…
Thursday is busy: 9-11am I will be at the Meta AI Booth 12.30-2pm Mission Impossible: A Statistical Perspective on Jailbreaking LLMs (neurips.cc/virtual/2024/p…) OR End-To-End Causal Effect Estimation from Unstructured Natural Language Data (neurips.cc/virtual/2024/p…)
For those into jailbreaking LLMs: our poster "Mission Impossible" today shows the fundamental limits of LLM alignment - and improved ways to go about it, nonetheless. With @karen_ullrich & Jingtong Su #2302 11am - 2pm Poster Session 3 East @NYUDataScience @AIatMeta #NeurIPS2024
Starting with Fei-Fei Li’s talk at 2.30; after that I will mostly be meeting people and wandering the poster sessions.
Folks, I am posting my NeurIPS schedule daily in hopes of seeing folks, thanks @tkipf for the idea ;) 11-12.30 WiML round tables 1.30-4 Beyond Decoding, Tutorial
Researchers at CDS and @AIatMeta prove vulnerabilities in AI language models are unavoidable, but introduce E-RLHF, a method to reduce jailbreaking. CDS’ Jingtong Su, @KempeLab, and @karen_ullrich push AI safety forward. Full details: nyudatascience.medium.com/ai-language-mo…