Lucas Bandarkar
@LucasBandarkar
PhD student @uclaNLP — ML / #NLProc / multilingual @AIatMeta
Cross-lingual transfer can be as easy as swapping model layers between LLMs! 🔀 Our model merging method can compose math and language skills by swapping the top & bottom layers from an SFT'd target-language expert into a math expert, without retraining. arxiv.org/pdf/2410.01335 🧵: [1/3]
![Tweet image](https://pbs.twimg.com/media/GZD6Oo5XwAACCAA.jpg)
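The core operation is simple enough to sketch. The block below is a minimal illustration of the layer-swapping idea, not the released implementation: it assumes two experts fine-tuned from the same Llama-style base (checkpoint names and the choice of K layers per end are placeholders), and copies the language expert's bottom and top transformer layers into the math expert.

```python
# Minimal sketch of the layer-swapping merge described above (not the authors' released
# code). Assumes a Llama-style architecture whose decoder layers are named
# "model.layers.{i}", and two experts fine-tuned from the same base model.
# "math-expert", "bengali-expert", and K are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM

math_expert = AutoModelForCausalLM.from_pretrained("math-expert", torch_dtype=torch.bfloat16)
lang_expert = AutoModelForCausalLM.from_pretrained("bengali-expert", torch_dtype=torch.bfloat16)

K = 8  # how many layers to swap at each end of the stack (a tunable assumption)
n = math_expert.config.num_hidden_layers
swap_ids = set(range(K)) | set(range(n - K, n))

merged = math_expert.state_dict()
for name, tensor in lang_expert.state_dict().items():
    # Overwrite only parameters belonging to the selected bottom/top layers.
    if any(f".layers.{i}." in name for i in swap_ids):
        merged[name] = tensor

math_expert.load_state_dict(merged)
math_expert.save_pretrained("math-expert-swapped-for-bengali")
```

Because both experts start from the same base, the swapped layers stay dimensionally compatible, which is what lets the composition work without any retraining.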
I’ll be at #ICLR2025 this week to present this Spotlight ✨ paper on post-hoc modularization-then-merging that enables a surprising amount of cross-lingual transfer. Super excited 😊
This is truly awesome: they use recurrent blocks (similar to diffusion models) to build an LLM that can think "longer" if extra reasoning is required. The concept is totally parallel to speculative decoding / early exiting
Ok, so I can finally talk about this! We spent the last year (actually a bit longer) training an LLM with recurrent depth at scale. The model has an internal latent space in which it can adaptively spend more compute to think longer. I think the tech report ...🐦⬛
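For intuition, here is a toy sketch of what recurrent depth with adaptive latent compute can look like; it is my own simplification (one shared block looped a variable number of times, no causal mask, no learned stopping rule), not the architecture from the tech report.

```python
# Toy recurrent-depth model: a single shared block is iterated a variable number of
# times over a latent state, so harder inputs can be given more compute.
import torch
import torch.nn as nn

class RecurrentDepthLM(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512, n_heads=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # One shared block reused across iterations instead of a fixed stack of layers.
        self.core = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids, num_iters: int = 4):
        h = self.embed(input_ids)
        state = torch.zeros_like(h)        # latent state refined across iterations
        for _ in range(num_iters):         # more iterations = "thinking longer"
            state = self.core(state + h)
        return self.lm_head(state)

model = RecurrentDepthLM()
tokens = torch.randint(0, 32000, (1, 16))
logits_fast = model(tokens, num_iters=2)   # cheap pass for easy inputs
logits_slow = model(tokens, num_iters=16)  # extra latent compute for hard inputs
```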
Paper #3: Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models. What can we do in model merging when we want to transfer task performance from one language to another? @LucasBandarkar got y'all covered! Link: arxiv.org/abs/2410.01335
This paper reveals a number of heuristic-style errors in dense retrievers (e.g., those used for RAG). Accepted at ACL, congrats @mohsen_fayyaz
new paper! 🌱 Collapse of Dense Retrievers We uncover major vulnerabilities in dense retrievers like Contriever, showing they favor: 📌 Shorter docs 📌 Early positions 📌 Repeated entities 📌 Literal matches ...all while ignoring the answer's presence! huggingface.co/datasets/mohse…
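A quick way to see the kind of failure listed above is to score hand-built probes with a public dense retriever. The snippet below uses the facebook/contriever checkpoint with standard mean pooling; the query and passages are made-up examples for illustration, not items from the paper's benchmark.

```python
# Illustrative literal-match probe (not the paper's exact evaluation setup): compare
# the retrieval score of a passage that actually contains the answer against a
# distractor that merely repeats the query's surface terms.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("facebook/contriever")
model = AutoModel.from_pretrained("facebook/contriever")

@torch.no_grad()
def embed(texts):
    inputs = tok(texts, padding=True, truncation=True, return_tensors="pt")
    out = model(**inputs)[0]                       # per-token embeddings
    mask = inputs["attention_mask"].unsqueeze(-1)  # zero out padding, then mean-pool
    return (out * mask).sum(1) / mask.sum(1)

query = "Who wrote The Old Man and the Sea?"
with_answer = ("The novella, published in 1952, was written by Ernest Hemingway "
               "late in his career and won the Pulitzer Prize.")
literal_match = "The Old Man and the Sea. The Old Man and the Sea is a famous book."

q, d_ans, d_lit = embed([query, with_answer, literal_match])
print("score(answer doc):   ", torch.dot(q, d_ans).item())
print("score(literal match):", torch.dot(q, d_lit).item())
# If the literal-match distractor scores higher, the retriever is keying on surface
# overlap rather than on whether the answer is actually present.
```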
🚨 New Blog Drop! 🚀 "Reflection on Knowledge Editing: Charting the Next Steps" is live! 💡 Ever wondered why knowledge editing in LLMs still feels more like a lab experiment than a real-world solution? In this post, we dive deep into where the research is thriving — and where…
Next, we studied the effect of the question language and found that, in general, performance is higher when the question is asked in the 'native' language. In the plot, *mother tongue effect* = (performance when the question is asked in the language to which it is relevant) − (performance when it is asked in English)
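In other words, the metric is just a per-language difference of accuracies. A tiny sketch with made-up numbers:

```python
# How the *mother tongue effect* can be computed from per-language accuracies.
# The numbers below are invented for illustration only.
native_question_acc  = {"sw": 0.62, "he": 0.71, "ja": 0.68}  # question asked in the relevant language
english_question_acc = {"sw": 0.58, "he": 0.66, "ja": 0.70}  # same items, question asked in English

mother_tongue_effect = {
    lang: native_question_acc[lang] - english_question_acc[lang]
    for lang in native_question_acc
}
print(mother_tongue_effect)  # positive values mean asking in the 'native' language helps
```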
This is seriously cool — an HQ dataset that can open up all sorts of studies on cross-lingual {local} knowledge transfer in LLMs
So happy our new multilingual benchmark MultiLoKo is finally out (after some sweat and tears!) arxiv.org/abs/2504.10356 Multilingual eval for LLMs... could be better, and I hope MultiLoKo will help fill some gaps in it + help study benchmark design choices @metaai
🚨 New paper 🚨 Excited to share my first paper w/ my PhD students!! We find that advanced LLM capabilities conferred by instruction or alignment tuning (e.g., SFT, RLHF, DPO, GRPO) can be encoded into model diff vectors (à la task vectors) and transferred across model…
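As a rough sketch of the diff-vector recipe (my paraphrase of the idea, not the paper's code), with placeholder checkpoint names and an assumed scaling factor:

```python
# Hedged sketch of a "model diff vector" (à la task vectors): subtract the base from
# its tuned version, then add the scaled diff to another model with the same
# architecture. Checkpoint names and alpha are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM

base     = AutoModelForCausalLM.from_pretrained("base-model", torch_dtype=torch.bfloat16)
tuned    = AutoModelForCausalLM.from_pretrained("base-model-instruct", torch_dtype=torch.bfloat16)
receiver = AutoModelForCausalLM.from_pretrained("sibling-model-same-arch", torch_dtype=torch.bfloat16)

base_sd, tuned_sd = base.state_dict(), tuned.state_dict()
diff = {k: tuned_sd[k] - base_sd[k] for k in base_sd}   # capability encoded as a parameter delta

alpha = 1.0                                             # assumed scaling knob
new_sd = {k: v + alpha * diff[k] for k, v in receiver.state_dict().items()}
receiver.load_state_dict(new_sd)
receiver.save_pretrained("receiver-plus-instruct-diff")
```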
🚨Selecting the best prompting strategy for LLMs is challenging, and ensembling is inefficient. We introduce DyPlan 🧠, a dynamic framework that teaches LLMs to use internal knowledge to pick the best strategy. It cuts token/retrieval costs by 7-13% and boosts F1 by 11-32%. (1/N)
“That’s one small [MASK] for [MASK], a giant [MASK] for mankind.” – [MASK] Armstrong Can autoregressive models predict the next [MASK]? It turns out yes, and quite easily… Introducing MARIA (Masked and Autoregressive Infilling Architecture) arxiv.org/abs/2502.06901
Open LLM evals often face data contamination and bias concerns. Private curators🚪(@scale_AI) address this with curated data and expert evaluations👲 We argue that this shift poses new risks, including financial incentives 💸 and eval bias☠️!! w/ @pratyushmaini
Belebele extended to speech for 74 (!!) languages (this project also extended Fleurs to more languages)
We introduce the first highly multilingual speech and American Sign Language (ASL) comprehension dataset by extending BELEBELE. arxiv.org/abs/2412.08274 Freely available on GitHub: github.com/facebookresear…
📣Happy to (pre-)release my Fleurs-SLU benchmark to evaluate massively multilingual spoken language understanding on SIB & Belebele. Work done at @Mila_Quebec with @davlanade @gg42554 @licwu Datasets: huggingface.co/datasets/WueNL… huggingface.co/datasets/WueNL… Details to follow👇
We also translate MMLU to build an extensive evaluation set in 42 languages. We further engage with professional and community annotators to improve the quality of the MMLU translations – we introduce this as Global-MMLU🌍
This dataset subsamples MMLU to limit questions that are too Western-centric, and they then translate it into 42 languages. Wow @CohereForAI with two big multilingual benchmarks released this week. Great to know I will no longer have to rely on machine-translated MMLU
Today, we’re excited to share Global-MMLU 🌍: a multilingual LLM benchmark covering MMLU translations in 42 languages -- combined with improved quality through human curation and extensive metadata on what questions are culturally sensitive 🗽
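For anyone who wants to kick the tires, the benchmark should be loadable with 🤗 datasets; the dataset ID and the language config name below are my assumptions based on the announcement, so double-check them on the Hub.

```python
# Assumed quick-start: load one language split of Global-MMLU from the Hugging Face Hub.
# The dataset ID "CohereForAI/Global-MMLU" and the "ar" (Arabic) config are assumptions.
from datasets import load_dataset

global_mmlu_ar = load_dataset("CohereForAI/Global-MMLU", "ar", split="test")
print(global_mmlu_ar[0])  # inspect the first question and its metadata
```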