Daria Lioubashevski
@DariaLioub
CS MSc student @CseHuji | Student Researcher @Google
📢Paper release📢 What computation is the Transformer performing in the layers after the top-1 becomes fixed (a so-called "saturation event")? We show that the next highest-ranked tokens also undergo saturation *in order* of their ranking. Preprint: arxiv.org/abs/2410.20210 1/4

Ever wondered how Transformers refine their top-k predictions over their layers? 📊 Is there an order to the madness? Come find out at my poster presentation tomorrow at @icmlconf 📍East Exhibition Hall E-2512, 11:00-13:30

I'll be at #ICML2025 next week and would love to chat about mechanistic interpretability, neuroAI, or cognitive computational neuroscience. BTW, if you're already in Vancouver, I highly recommend the Aquarium! (second image is me with jet lag 😂)


Can (A)I change your mind? New study finds LLMs can be as persuasive as humans — even in real-world, ecological conversations on controversial topics. arxiv.org/abs/2503.01844 #AI #LLM #Persuasion #CogSci2025 @cogsci_soc @timnaWK @GoldsteinYAriel @yanivdover @morankor
Accepted at #icml2025🥳 Camera ready version (with newer models like Llama-3 and Qwen-Audio) coming soon!
✨ Ever tried generating an image from a prompt but ended up with unexpected outputs? Check out our new paper #FollowTheFlow - tackling T2I issues like bias, failed binding, and leakage from the textual encoding side! 💼🔍 arxiv.org/pdf/2504.01137 guykap12.github.io/guykap12.githu… 🧵[1/7]
New Preprint 🎉 LLM self-assessment unlocks efficient decoding ✅ Our Confidence-Informed Self-Consistency (CISC) method cuts compute without losing accuracy. We also rethink confidence evaluation & contribute to the debate on self-verification. arxiv.org/abs/2502.06233 1/8👇
Very excited to share our new paper published in Nature Communications @NatureComms (link below). This work is part of my PhD research under the supervision of @roireichart (Technion), @HassonUri (@HassonLab), and @ArielYGoldstein, in collaboration with @YoavMeiri.
Look at the CRAZY domain gap we found in summarization datasets: while English resources are diverse, other languages are mostly restricted to news. Presenting our survey covering 130+ datasets in 100+ languages! Explore: github.com/edahanoam/Awes… @GabiStanovsky, @nlphuji 1/6
In which layers does information flow from previous tokens to the current token? Presenting our new @BlackboxNLP paper: “Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers” arxiv.org/abs/2409.03621 1/n
Hallucinations are a subject of much interest, but how much do we know about them? In our new paper, we found that the internals of LLMs contain far more information about truthfulness than we knew! 🧵 Project page >> llms-know.github.io Arxiv >> arxiv.org/abs/2410.02707
📢 New paper alert! 📢 Thrilled to announce "Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias". Do instruction tuning and RLHF amplify biases in LMs? 🧵 Check it out arxiv.org/abs/2308.00225 With @boknilev, @GabiStanovsky, and N. Rosenfeld.