Logan Engstrom
@logan_engstrom
research @openai
Want state-of-the-art data curation, data poisoning & more? Just do gradient descent! w/ @andrew_ilyas Ben Chen @axel_s_feldmann @wsmoses @aleks_madry: we show how to optimize final model loss wrt any continuous variable. Key idea: Metagradients (grads through model training)
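[For intuition, here's a minimal sketch of the metagradient idea in JAX: unroll a short SGD run, then let autodiff carry the final loss's gradient back to per-example training-data weights. The toy linear model and all names here are illustrative assumptions, not the paper's actual pipeline.]

```python
import jax
import jax.numpy as jnp

def weighted_loss(params, x, y, w):
    preds = x @ params                      # toy linear model
    return jnp.mean(w * (preds - y) ** 2)   # per-example weights w

def train(w, params, x, y, lr=0.1, steps=20):
    # Unrolled SGD: the whole loop stays differentiable w.r.t. w.
    for _ in range(steps):
        params = params - lr * jax.grad(weighted_loss)(params, x, y, w)
    return params

def final_loss(w, params0, x_tr, y_tr, x_te, y_te):
    params = train(w, params0, x_tr, y_tr)
    return jnp.mean((x_te @ params - y_te) ** 2)   # unweighted test loss

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
x_tr = jax.random.normal(k1, (64, 5)); y_tr = x_tr @ jnp.ones(5)
x_te = jax.random.normal(k2, (16, 5)); y_te = x_te @ jnp.ones(5)
w = jnp.ones(64)

# Metagradient: d(final test loss) / d(per-example training weights).
metagrad = jax.grad(final_loss)(w, jnp.zeros(5), x_tr, y_tr, x_te, y_te)
w = w - 0.5 * metagrad   # descent curates data; ascent would poison it
```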

“How will my model behave if I change the training data?” Recent(-ish) work w/ @logan_engstrom: we nearly *perfectly* predict ML model behavior as a function of training data, saturating benchmarks for this problem (called “data attribution”).
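[For a sense of how such predictions can work, here's a toy sketch in the spirit of linear datamodels, one assumed flavor of data attribution: train on many random training subsets, record a scalar model output for each, and fit a linear surrogate over the inclusion masks. `train_and_eval` is a hypothetical stand-in for actually training a model.]

```python
import jax
import jax.numpy as jnp

n_train, n_subsets = 100, 400
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)

true_scores = jnp.linspace(-1.0, 1.0, n_train)   # toy ground truth

def train_and_eval(mask):
    # Hypothetical stand-in: "train" on examples where mask == 1 and return
    # a scalar model output (e.g. loss on one target test example).
    return mask @ true_scores

masks = (jax.random.uniform(k1, (n_subsets, n_train)) < 0.5).astype(jnp.float32)
outputs = jax.vmap(train_and_eval)(masks) + 0.01 * jax.random.normal(k2, (n_subsets,))

# Least-squares fit: one attribution score per training example, plus bias.
X = jnp.hstack([masks, jnp.ones((n_subsets, 1))])
theta, *_ = jnp.linalg.lstsq(X, outputs)
scores, bias = theta[:-1], theta[-1]

# The surrogate now predicts model behavior for any new training subset.
new_mask = (jax.random.uniform(k3, (n_train,)) < 0.5).astype(jnp.float32)
predicted_output = scores @ new_mask + bias
```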
Had a great time at @SimonsInstitute talking about new & upcoming work on meta-optimization of ML training. tl;dr: we show how to compute gradients *through* the training process & use them to optimize training. Immediate big gains on data selection, poisoning, attribution & more!
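[The same trick applies to any continuous knob of training. A toy sketch under the same unrolled-training assumption as above, not the talk's actual setup: meta-optimize the learning rate by gradient descent on held-out loss, with gradients flowing through the whole training loop.]

```python
import jax
import jax.numpy as jnp

def train(lr, params, x, y, steps=20):
    # The training loop is differentiable w.r.t. the learning rate lr.
    grad_fn = jax.grad(lambda p: jnp.mean((x @ p - y) ** 2))
    for _ in range(steps):
        params = params - lr * grad_fn(params)
    return params

def meta_loss(lr, params0, x_tr, y_tr, x_va, y_va):
    params = train(lr, params0, x_tr, y_tr)
    return jnp.mean((x_va @ params - y_va) ** 2)   # held-out loss

k1, k2 = jax.random.split(jax.random.PRNGKey(1))
x_tr = jax.random.normal(k1, (64, 5)); y_tr = x_tr @ jnp.ones(5)
x_va = jax.random.normal(k2, (16, 5)); y_va = x_va @ jnp.ones(5)
lr = jnp.float32(0.01)

for _ in range(10):
    # Gradient descent on the learning rate itself, through training.
    lr = lr - 0.01 * jax.grad(meta_loss)(lr, jnp.zeros(5), x_tr, y_tr, x_va, y_va)
```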
Very cool work from Logan and the gang! One of these problems, indiscriminate data poisoning, has been one of my favorite mysteries in robustness -- and they did an order of magnitude better than we could previously! Looking forward to checking it out in more detail.
After some very fun years at MIT, I'm really excited to be joining CMU as an assistant professor in Jan 2026! A big (huge!) thanks to my advisors (@aleks_madry @KonstDaskalakis), collaborators, mentors & friends. In the meantime, I'll be a Stein Fellow at Stanford Statistics.
Announcing a deadline extension for the ATTRIB workshop! Submissions are now due September 25th, with an option to submit October 4th if at least one paper author volunteers to be an emergency reviewer. More info here: attrib-workshop.cc
The ATTRIB workshop is back @ NeurIPS 2024! We welcome papers connecting model behavior to data, algorithms, parameters, scale, or anything else. Submit by Sep 18! More info: attrib-workshop.cc Co-organizers: @tolgab0 @logan_engstrom @SadhikaMalladi @_elinguyen @smsampark
Thanks to all who attended our tutorial "Data Attribution at Scale" at ICML (w/ @smsampark @logan_engstrom @kris_georgiev1 @aleks_madry)! We're really excited to see the response to this emerging topic. Slides, notes, ICML video: ml-data-tutorial.org Public recording soon!
Stop by our poster on model-aware dataset selection at ICML! Location/time: 1:30pm Hall C 4-9 #1010 (Tuesday) Paper: arxiv.org/abs/2401.12926 with: @axel_s_feldmann @aleks_madry
At #ICML2024? Our tutorial "Data Attribution at Scale" will be tomorrow at 9:30 AM CEST in Hall A1! I will not be able to make it (but will arrive later that day), and my awesome students @andrew_ilyas @smsampark @logan_engstrom will carry the torch :)
In work w/ @andrew_ilyas @_JenAllen @hannahq_li @aleks_madry we give experimental evidence that users strategize on recommender systems! We find that users react to their (beliefs about) *algorithms* (not just content!) to shape future recs. Paper: arxiv.org/abs/2405.05596 1/8
How is an LLM actually using the info given to it in its context? Is it misinterpreting anything or making things up? Introducing ContextCite: a simple method for attributing LLM responses back to the context: gradientscience.org/contextcite w/ @bcohenwang, @harshays_, @kris_georgiev1
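[One way such attribution can work, as a hedged sketch rather than ContextCite's exact recipe: ablate random subsets of context sources, measure how the log-probability of the model's fixed response changes, and fit a linear surrogate whose coefficients score each source. `response_logprob` below is a hypothetical stand-in for a real LLM call.]

```python
import jax
import jax.numpy as jnp

n_sources, n_ablations = 8, 64
key = jax.random.PRNGKey(0)

true_effect = jnp.array([0., 3., 0., 0., 1., 0., 0., 0.])  # toy ground truth

def response_logprob(mask):
    # Hypothetical stand-in for an LLM call: log-probability of the model's
    # original response given only the context sources where mask == 1.
    return mask @ true_effect

masks = (jax.random.uniform(key, (n_ablations, n_sources)) < 0.5).astype(jnp.float32)
logprobs = jax.vmap(response_logprob)(masks)

# Fit a linear surrogate: each coefficient scores how much one context
# source contributes to the response.
X = jnp.hstack([masks, jnp.ones((n_ablations, 1))])
theta, *_ = jnp.linalg.lstsq(X, logprobs)
attributions = theta[:-1]
print(jnp.argsort(-attributions)[:2])   # the two most influential sources
```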