Andrea de Varda
@devarda_a
Postdoc at MIT BCS, interested in language(s) in humans and LMs
New preprint! 🧠🤖 Brain encoding in 21 languages! biorxiv.org/content/10.110… w/ @saima_mm, @GretaTuckute, and @ev_fedorenko (1/)
Have reading-time corpora been leaked into LM pre-training corpora? Should you, as a consequence, be cautious about using pre-trained LM surprisal? We identify the longest overlapping token sequences and conclude that the leakage is mostly not severe. In Findings of #ACL2025 #ACL2025NLP
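Not from the paper itself, but here is a minimal sketch of one way to find the longest overlapping token sequence between a stimulus text and a pre-training sample (classic longest-common-substring dynamic programming; the tokenization and example strings are illustrative):

```python
def longest_token_overlap(stimulus_tokens, pretrain_tokens):
    """Return the longest contiguous token sequence shared by both lists
    (longest-common-substring DP, O(n*m) time, O(m) memory)."""
    n, m = len(stimulus_tokens), len(pretrain_tokens)
    prev = [0] * (m + 1)
    best_len, best_end = 0, 0
    for i in range(1, n + 1):
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            if stimulus_tokens[i - 1] == pretrain_tokens[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return stimulus_tokens[best_end - best_len:best_end]

stimulus = "the quick brown fox jumps over the lazy dog".split()
corpus = "he saw the quick brown fox run away yesterday".split()
print(longest_token_overlap(stimulus, corpus))  # ['the', 'quick', 'brown', 'fox']
```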
👀📖Big news! 📖👀 Happy to announce the release of OneStop Eye Movements!🍾🍾 The OneStop dataset is the product of over 6 years of experimental design, data collection, and data curation. github.com/lacclab/OneSto…
What are the organizing dimensions of language processing? We show that voxel responses are organized along 2 main axes: processing difficulty & meaning abstractness, revealing an interpretable, topographic representational basis for language processing shared across individuals.
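As a toy illustration of extracting two organizing axes from a voxel-by-stimulus response matrix, here is a minimal PCA sketch; the paper's actual decomposition method may differ, and the data below are simulated:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_voxels, n_stimuli = 500, 200
responses = rng.standard_normal((n_voxels, n_stimuli))  # stand-in for real fMRI responses

pca = PCA(n_components=2)
voxel_scores = pca.fit_transform(responses)  # (n_voxels, 2): each voxel's position on the two axes
print(pca.explained_variance_ratio_)
# In the paper's framing, such axes would be interpreted as processing
# difficulty and meaning abstractness; here they are just the top PCs of noise.
```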
✨New paper ✨ Introducing 🌍MultiBLiMP 1.0: A Massively Multilingual Benchmark of Minimal Pairs for Subject-Verb Agreement, covering 101 languages! We present over 125,000 minimal pairs and evaluate 17 LLMs, finding that support is still lacking for many languages. 🧵⬇️
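For context, a minimal sketch of the standard minimal-pair scoring recipe with a causal LM: the model "passes" a pair if it assigns higher total log-probability to the grammatical sentence. The model name and example pair are illustrative, not necessarily MultiBLiMP's exact protocol:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def sentence_logprob(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # log P(token_t | tokens_<t), summed over the sentence
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    return logprobs.gather(-1, ids[:, 1:, None]).sum().item()

good = "The keys to the cabinet are here."
bad = "The keys to the cabinet is here."
print(sentence_logprob(good) > sentence_logprob(bad))  # True if the model prefers agreement
```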
🚨 New Preprint!! LLMs trained on next-word prediction (NWP) show high alignment with brain recordings. But what drives this alignment: linguistic structure or world knowledge? And how does this alignment evolve during training? Our new paper explores these questions. 👇🧵
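A minimal sketch of one common brain-alignment recipe (ridge regression from LM features to voxel responses, scored by held-out correlation); the paper's exact procedure may differ, and all data here are simulated:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_stimuli, n_features, n_voxels = 400, 768, 100
lm_features = rng.standard_normal((n_stimuli, n_features))  # e.g., LM hidden states per sentence
brain = (lm_features @ rng.standard_normal((n_features, n_voxels))) * 0.1 \
        + rng.standard_normal((n_stimuli, n_voxels))        # simulated voxel responses

X_tr, X_te, y_tr, y_te = train_test_split(lm_features, brain, test_size=0.2, random_state=0)
model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Alignment score: mean Pearson r between predicted and observed responses per voxel
r = [np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(n_voxels)]
print(f"mean held-out r = {np.mean(r):.3f}")
# Repeating this across training checkpoints would trace how alignment evolves.
```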