Pietro Lesci
@pietro_lesci
Final-year PhD student @cambridge_uni. Causality & language models | ex @bainandcompany @ecb @amazonscience. Passionate musician, professional debugger.
Super excited and grateful that our paper received the best paper award at #ACL2024 🎉 Huge thanks to my fantastic co-authors — @clara__meister, Thomas Hofmann, @vlachos_nlp, and @tpimentelms — the reviewers that recommended our paper, and the award committee #ACL2024NLP
Happy to share our #ACL2024 paper: "Causal Estimation of Memorisation Profiles" 🎉 Drawing from econometrics, we propose a principled and efficient method to estimate memorisation using only observational data! See 🧵 +@clara__meister, Thomas Hofmann, @vlachos_nlp, @tpimentelms
I will also be sharing more tokenisation work from @cambridgenlp at TokShop, this time on Tokenisation Bias, by @pietro_lesci, @vlachos_nlp, @clara__meister, Thomas Hofmann, and @tpimentelms.
I'm in Vancouver for TokShop @tokshop2025 at ICML @icmlconf to present joint work with my labmates, @tweetByZeb, @pietro_lesci and @julius_gulius, and Paula Buttery. Our work, ByteSpan, is an information-driven subword tokenisation method inspired by human word segmentation.
Mechanistic interpretability often relies on *interventions* to study how DNNs work. Are these interventions enough to guarantee the features we find are not spurious? No! ⚠️ In our new paper, we show many mech interp methods implicitly rely on the linear representation hypothesis 🧵
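For context on what "interventions" means here, a minimal sketch of one common style of linear intervention (ablating a candidate feature direction from activations). The activations and direction below are invented for illustration; this is not the paper's setup.

```python
# Toy sketch, NOT the paper's method: the flavour of intervention many
# mech interp pipelines use, which implicitly assumes a feature is a
# linear direction in activation space. All values here are made up.
import numpy as np

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 16))   # hypothetical activations: 4 tokens x 16 dims
feature = rng.normal(size=16)       # hypothetical candidate "feature" direction
feature /= np.linalg.norm(feature)

def ablate(h, d):
    """Project the direction d out of every activation vector (a linear intervention)."""
    return h - np.outer(h @ d, d)

patched = ablate(hidden, feature)
print(np.allclose(patched @ feature, 0))  # True: the direction has been removed
# If the feature is not actually encoded linearly, an intervention like this can
# still change behaviour while pointing at something spurious.
```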
Looking forward to this year's edition! With great speakers: Ryan McDonald @yulanhe Vlad Niculae @anas_ant @raquel_dmg @annargrs @preslav_nakov @mohitban47 @eunsolc @MarieMarneffe!
📢 10 Days Left to apply for the AthNLP - Athens Natural Language Processing Summer School! ✍ Get your applications in before June 15th! athnlp.github.io/2025/cfp.html
If you use LLMs, tokenisation bias probably affects you: * Text generation: tokenisation bias ⇒ length bias 🤯 * Psycholinguistics: tokenisation bias ⇒ systematically biased surprisal estimates 🫠 * Interpretability: tokenisation bias ⇒ biased logits 🤔
A string may get 17 times less probability if tokenised as two symbols (e.g., ⟨he, llo⟩) than as one (e.g., ⟨hello⟩)—by an LM trained from scratch in each situation! Our #acl2025nlp paper proposes an observational method to estimate this causal effect! Longer thread soon!
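To make the gap concrete, a tiny sketch with invented probabilities (picked only so the ratio lands near the 17x headline); the paper estimates this causal effect from observational data rather than by training two models per string.

```python
# Toy illustration with made-up numbers, not the paper's observational estimator:
# the probability an autoregressive LM assigns to "hello" depends on how the
# string is tokenised.
import math

# Hypothetical log-probabilities from two LMs trained with different tokenisations:
logp_hello_single = math.log(0.017)   # p(<hello>) under the single-token tokenisation
logp_he           = math.log(0.05)    # p(<he>) under the split tokenisation
logp_llo_given_he = math.log(0.02)    # p(<llo> | <he>)

p_single = math.exp(logp_hello_single)
p_split  = math.exp(logp_he + logp_llo_given_he)   # chain rule over the two symbols

print(round(p_single / p_split))   # -> 17: same string, very different probability
```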
🔥 We teach LLMs to say how confident they are on-the-fly during long-form generation. 🤩 No sampling. No slow post-hoc methods. Not limited to short-form QA! ‼️ Just output confidence in a single decoding pass. ✅ Better calibration! 🚀 20× faster runtime. arXiv:2505.23912 👇
If you're finishing your camera-ready for ACL (#acl2025nlp) or ICML (#icml2025) and want to cite co-first authors more fairly, I just made a simple fix to do this! Just add $^*$ to the authors' names in your bibtex, and the citations should change :) github.com/tpimentelms/ac…
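A hypothetical BibTeX entry (names and key invented) to show where the marker goes; the linked repo provides the style fix that makes the rendered citation reflect co-first authorship.

```bibtex
% Hypothetical entry, invented names/key: add $^*$ to each co-first author's
% name in the author field; the style fix then renders the citation accordingly.
@inproceedings{doe-roe-2025-example,
  title  = {An Example Paper With Co-First Authors},
  author = {Doe$^*$, Jane and Roe$^*$, Richard and Smith, Alex},
  year   = {2025},
}
```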
Inception Lab and Gemini Diffusion are hot these days. Just published a blog post on Diffusion Language Models! 🚀 Exploring how diffusion (yes, the image model kind) can be used for text generation. Check it out 👇 spacehunterinf.github.io/blog/2025/diff… #NLP #LLMs #DiffusionModels
📢 @aclmeeting notifications have been sent out, making this the perfect time to finalize your commitment. Don't miss the opportunity to be part of the workshop! 🔗 Commit here: openreview.net/group?id=aclwe… 🗓️ Deadline: May 20, 2025 (AoE) #ACL2025 #NLProc
The call for papers for the 8th @FEVERworkshop at #ACL is out: fever.ai/workshop.html The submission deadline is May 19th! And if you have a paper already reviewed in ARR, you can commit it until June 9th!