Javier Abad Martinez
@JavierAbadM
PhD Student @ETH_AI_Center | Interested in AI Safety, Privacy & Causal Inference
(1/5) LLMs risk memorizing and regurgitating training data, raising copyright concerns. Our new work introduces CP-Fuse, a strategy to fuse LLMs trained on disjoint sets of protected material. The goal? Preventing unintended regurgitation 🧵 Paper: arxiv.org/pdf/2412.06619
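For intuition, here is a minimal Python sketch of the per-step fusion idea (hedged: this is not the paper's exact algorithm; `fused_next_token_dist`, the grid search, and the balance criterion are illustrative stand-ins for the convex program CP-Fuse solves at each decoding step):

```python
import numpy as np

def fused_next_token_dist(logp1, logp2, grid=np.linspace(0, 1, 101)):
    """Fuse one decoding step of two models trained on disjoint
    protected sets (illustrative sketch, not the paper's objective)."""
    best_logp, best_gap = None, np.inf
    for a in grid:
        # Geometric mixture p ∝ p1^a * p2^(1-a), formed in log space.
        logp = a * logp1 + (1 - a) * logp2
        logp -= np.logaddexp.reduce(logp)        # renormalize
        p = np.exp(logp)
        # Pick the weight that balances the two models' expected
        # log-probabilities, so text memorized by only one model
        # (e.g., verbatim protected material) is suppressed.
        gap = abs(p @ logp1 - p @ logp2)
        if gap < best_gap:
            best_logp, best_gap = logp, gap
    return best_logp

# Toy example: 5-token vocabulary, two disagreeing models.
rng = np.random.default_rng(0)
lp1 = np.log(rng.dirichlet(np.ones(5)))
lp2 = np.log(rng.dirichlet(np.ones(5)))
print(np.exp(fused_next_token_dist(lp1, lp2)))
```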
Presenting our work at #ICLR this week! Come by the poster or oral session to chat about copyright protection and AI/LLM safety 📌 𝐏𝐨𝐬𝐭𝐞𝐫: Friday, 10 a.m. – 12:30 p.m. | Booth 537 📌 𝐎𝐫𝐚𝐥: Friday, 3:30 – 5 p.m. | Room Peridot @FraPintoML @DonhauserKonst @FannyYangETH
LLMs accidentally spitting out copyrighted content? We’ve got a fix. Our paper on CP-Fuse—a method to prevent LLMs from regurgitating protected data—got accepted as an Oral at #ICLR2025! 👇Check it out! 📄 arxiv.org/pdf/2412.06619 🤖 github.com/jaabmar/cp_fuse
Register now (first-come, first-served) for the "Math of Trustworthy ML" workshop at #LagoMaggiore, Switzerland, Oct 12-16 this year, with a great speaker lineup and the opportunity to present your work in a poster session or as a contributed talk. Details @ mmlworkshop.ethz.ch
What if learning and unlearning happen simultaneously, with unlearning requests arriving between updates? Check out our work on online learning and unlearning: algorithms with certified unlearning, low computational overhead, and almost no additional regret. arxiv.org/abs/2505.08557
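A toy illustration of the interleaving (hedged: `OnlineRidge` is an assumption, the simplest special case where deletion is exact via sufficient statistics; the paper's algorithms handle far more general settings with certified guarantees):

```python
import numpy as np

class OnlineRidge:
    """Online ridge regression with exact unlearning: after
    unlearn(x, y), the model is identical to one never trained
    on (x, y). Illustrative only."""

    def __init__(self, d, lam=1.0):
        self.A = lam * np.eye(d)   # running X^T X + lam * I
        self.b = np.zeros(d)       # running X^T y

    def learn(self, x, y):
        self.A += np.outer(x, x)
        self.b += y * x

    def unlearn(self, x, y):
        # Subtract the point's contribution to the sufficient stats.
        self.A -= np.outer(x, x)
        self.b -= y * x

    def predict(self, x):
        return x @ np.linalg.solve(self.A, self.b)

# Interleave learning updates with unlearning requests on a stream.
rng = np.random.default_rng(1)
model = OnlineRidge(d=3)
stream = [(rng.normal(size=3), rng.normal()) for _ in range(100)]
for i, (x, y) in enumerate(stream):
    model.learn(x, y)
    if i % 10 == 9:                 # a deletion request arrives
        model.unlearn(*stream[i - 5])
```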
Advertising an open postdoc position in learning theory, privacy, robustness, unlearning, or any related topic with me and others at the University of Copenhagen, Denmark. If you think you would be a good candidate, send me an email: amartya18x.github.io/hiring/
Presenting 2 posters today at ICLR. Come check them out! 10am ➡️ #502: Scalable Extraction of Training Data from Aligned, Production Language Models 3pm ➡️ #324: Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Landed in Singapore for #ICLR—excited to see old & new friends! I’ll be presenting: 📌 RAMEN @ Main Conference on Saturday 10 am (@JavierAbadM @yixinwang_ @FannyYangETH) 📌 Causal Lifting @ XAI4Science Workshop on Sunday (@riccardocadeii @ilkerdemirel_ @FrancescoLocat8 )
Very shortly at @RealAAAI, @alexandrutifrea and I will be giving a tutorial on the impact of the quality and availability of labels and data on the privacy, fairness, and robustness of ML algorithms. See here: amartya18x.github.io/files/Tutorial… @MLSectionUCPH @DIKU_Institut @GoogleDeepMind
Eager to hear feedback on this recent work from anyone who applies causal inference, done with this amazing group of people: @pdebartols @JavierAbadM @Guanbo17 @DonhauserKonst @RayDuch and Issa Dahabreh.
Looking for a more efficient way to estimate treatment effects in your randomized experiment? We introduce H-AIPW: a novel estimator that combines predictions from multiple foundation models with real experimental data. arxiv.org/abs/2502.04262
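A minimal sketch of the idea (hedged: `aipw` is the textbook estimator and the inverse-variance `combine` is an illustrative stand-in for the actual H-AIPW construction; see the paper for the real weighting and its guarantees):

```python
import numpy as np

def aipw(y, t, e, m1, m0):
    """AIPW estimate of the average treatment effect in an RCT with
    known propensity e. m1/m0 are predicted outcomes under treatment
    and control -- here imagined to come from a foundation model.
    With e known, even a poor predictor leaves the estimate unbiased;
    an accurate one shrinks its variance."""
    psi = m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return psi.mean(), psi.var(ddof=1) / len(psi)

def combine(estimates):
    """Inverse-variance weighting of per-model AIPW estimates
    (simplified stand-in for the H-AIPW combination)."""
    est, var = map(np.array, zip(*estimates))
    w = (1 / var) / (1 / var).sum()
    return w @ est
```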
Excited to present at #NeurIPS2024 our work on robust mixture learning! How hard is mixture learning when (a lot of) outliers are present? We show that it's easier than it seems! Join us at the poster session (Wed, 16:30 PT, West Ballroom A-D #5710).
Thrilled to share our 8 conference paper contributions to @icmlconf 2024 next week. Congrats to our doctoral fellows, postdoctoral fellows, and staff members involved: @afra_amini, @Manish8Prajapat, @JavierAbadM, @DonhauserKonst, @desariky, @ImanolSchlag, D. Dmitriev, E. Bamas
Come to our AISTATS poster (#96) this afternoon (5-7pm) to learn more about hidden confounding!
Worried that hidden confounding stands in the way of your analysis? We propose a new strategy when a small RCT is available: quantify the confounding strength and make decisions accordingly. arxiv.org/abs/2312.03871 With @JavierAbadM, @DonhauserKonst & @FannyYangETH. 🧵(1/7)
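The core comparison can be sketched in a few lines (hedged: difference-in-means and regression adjustment below are illustrative stand-ins for the paper's estimators, and the data is synthetic; the paper turns this comparison into a formal quantification of confounding strength):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def ate_rct(y, t):
    # Unbiased difference-in-means estimate from the small RCT.
    return y[t == 1].mean() - y[t == 0].mean()

def ate_obs(y, t, X):
    # Regression-adjusted estimate from the observational study:
    # absorbs observed covariates X but not hidden confounders.
    Z = np.column_stack([t, X])
    return LinearRegression().fit(Z, y).coef_[0]

# Synthetic data with a hidden confounder u and true effect 1.0.
rng = np.random.default_rng(2)
u = rng.normal(size=5000)                                # hidden
X = rng.normal(size=(5000, 2))
t_obs = (u + rng.normal(size=5000) > 0).astype(float)    # confounded
y_obs = t_obs + 2 * u + X @ np.array([0.5, -0.3]) + rng.normal(size=5000)

t_rct = rng.integers(0, 2, size=500).astype(float)
y_rct = t_rct + 2 * rng.normal(size=500) + rng.normal(size=500)

# The gap between the two estimates signals hidden confounding.
print(ate_obs(y_obs, t_obs, X))   # biased away from 1.0
print(ate_rct(y_rct, t_rct))      # close to 1.0
```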