Adam Dziedzic
@adam_dziedzic
I'm a researcher, software developer, systems designer & engineer. I have a passion for machine/deep learning, databases, technology, traveling, sport & music.
Are modern large language models (LLMs) vulnerable to privacy attacks that can determine whether given data was used for training? Models and datasets are quite large; what should we even expect? Our new paper looks into this exact question. 🧵 (1/10)
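For readers who want a concrete picture of what such an attack looks like: a minimal sketch of the classic loss-threshold membership-inference idea, assuming a HuggingFace-style causal LM. The model name, threshold, and helper functions are illustrative placeholders, not the attack from the paper.

```python
# Minimal sketch of loss-threshold membership inference.
# Assumption: a HuggingFace-style causal LM; "gpt2" and the
# threshold value are placeholders, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

@torch.no_grad()
def sequence_loss(text: str) -> float:
    # Average per-token negative log-likelihood of `text` under the model.
    ids = tokenizer(text, return_tensors="pt").input_ids
    out = model(ids, labels=ids)
    return out.loss.item()

def is_member(text: str, threshold: float = 3.0) -> bool:
    # Heuristic: training members tend to have lower loss than unseen
    # text. The threshold must be calibrated on held-out data; this
    # constant is purely illustrative.
    return sequence_loss(text) < threshold
```

In practice, the threshold is calibrated on data known to be outside the training set, and stronger attacks aggregate such signals over many samples rather than deciding per sequence.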
Presenting two papers at the MemFM workshop at ICML! Both touch on how near-duplicates (and beyond) in LLM training data contribute to memorization. - arxiv.org/pdf/2405.15523 - arxiv.org/pdf/2506.20481 @_igorshilov @yvesalexandre
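A tiny sketch of one common way "near-duplicate" is operationalized, via Jaccard similarity over word n-grams; the n-gram size and threshold below are illustrative assumptions, not the papers' actual criterion.

```python
# Near-duplicate check via Jaccard similarity of word n-grams.
# `n` and `threshold` are illustrative, not the papers' definition.
def ngrams(text: str, n: int = 5) -> set:
    toks = text.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def near_duplicate(a: str, b: str, n: int = 5, threshold: float = 0.8) -> bool:
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:  # texts shorter than n words
        return a == b
    return len(ga & gb) / len(ga | gb) >= threshold
```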
Our panelists at the workshop on the impact of memorization on foundation models @icmlconf are discussing the gaps between academia and industry when it comes to memorization. Don't miss it and join us!
Giving an invited talk at the #MemFM workshop at ICML in 10 min. Room 223. I discuss why the privacy & safety communities need to talk to each other a lot more, drawing on my own journey in quantifying & erasing memorization and unsafe behaviours. PS: My first ever invited talk!🥹
Come and check out the talk by @pratyushmaini on "Unlocking Post-hoc Dataset Inference with Synthetic Data" at the Data in Generative Models @ ICML 2025 Workshop. The talk is at 2:15 pm. This is joint work with Bihe Zhao and @fraboeni. We are in West Ballroom A.

The amazing A. Feder Cooper talking at the memorization workshop about the nuances of memorization and copyright, and how copying is not just generation: the model itself is also "a copy" of the training data.
We’re happy to have such a great audience, and such great invited and contributed talks, at our #MemFM workshop at #ICML’25. And everyone is excitedly waiting for Feder Cooper’s upcoming talk on #copyrighted data in #foundation #models. ⏲️ Happening in 10 minutes!!
Big kudos also to @niloofar_mire for leading us through an exciting day full of #memorization at our #MemFM workshop at #ICML!
🧠📚 Join us for Vitaly Feldman’s take on the highly relevant question: “Why do training algorithms memorize *irrelevant* training data?” #memorization #workshop #MemFM #ICML2025
Join us in the West Building, room 223, for the #Memorization workshop!!
Our workshop #MemFM at #ICML’25 is going to start in 15 minutes in West Meeting Room 223-224 in the #Vancouver Convention Centre. We’re very much looking forward to hosting you for interesting talks. We also start the day with #breakfast snacks. icml2025memfm.github.io
🚨 Join us at ICML 2025 for the Workshop on Unintended Memorization in Foundation Models (MemFM)! 🚨 📅 Saturday, July 19 🕓 08:25 am - 5 pm 📍 West Meeting Room 223-224, Vancouver Convention Centre Link: icml2025memfm.github.io Let's build more Trustworthy Foundation Models!
