Lucas Caccia
@LucasPCaccia
Sr Researcher @ MSR Montréal. PhD from MILA / McGill
RAG and in-context learning are the go-to approaches for integrating new knowledge into LLMs, but they make inference very inefficient. We propose instead 𝗞𝗻𝗼𝘄𝗹𝗲𝗱𝗴𝗲 𝗠𝗼𝗱𝘂𝗹𝗲𝘀: lightweight LoRA modules trained offline that can match RAG performance without the drawbacks
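For readers unfamiliar with the mechanics, a LoRA-style knowledge module adds a low-rank update to a frozen weight matrix. The sketch below is illustrative only (names like `alpha` and rank `r` follow common LoRA conventions, not the paper's code):

```python
import numpy as np

# Minimal sketch of a LoRA-style "knowledge module": a frozen base weight W
# plus a low-rank update B @ A that could be trained offline on a document.
# Hypothetical toy dimensions; the real module sits inside a transformer layer.
rng = np.random.default_rng(0)
d_in, d_out, r = 16, 16, 4                   # r << d keeps the module lightweight
W = rng.standard_normal((d_out, d_in))       # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                     # trainable up-projection (zero init)
alpha = 8.0                                  # conventional LoRA scaling factor

def forward(x):
    # Base output plus the scaled low-rank knowledge update.
    return W @ x + (alpha / r) * (B @ (A @ x))
```

Because `B` starts at zero, the module is a no-op before training, and only the small `A`/`B` matrices need to be stored per document or domain.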
If you are working on merging / MoEfication of models, we wrote a survey mapping out the current research landscape. Please check it out :)
We just released our survey on "Model MoErging". But what is MoErging? 🤔 Read on! Imagine a world where fine-tuned models, each specialized in a specific domain, can collaborate and "compose/remix" their skills using some routing mechanism to tackle new tasks and queries! 🧵👇…
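The core idea can be sketched in a few lines: a router weights the outputs of several specialist experts per query. This is a generic illustration (toy linear experts, a softmax router over made-up key embeddings), not any specific MoErging method from the survey:

```python
import numpy as np

# Toy sketch of the MoErging idea: specialist "experts" (here, linear maps)
# are combined per-query by a router. All names are illustrative.
rng = np.random.default_rng(3)
d, n_experts = 8, 3
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
expert_keys = rng.standard_normal((n_experts, d))   # hypothetical routing embeddings

def softmax(z):
    z = z - z.max()           # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def moerge(x):
    # Route: weight each specialist by how well its key matches the query,
    # then mix the experts' outputs with those weights.
    w = softmax(expert_keys @ x)
    return sum(wi * (E @ x) for wi, E in zip(w, experts))
```

The survey's taxonomy covers many ways to obtain the routing signal (learned, unsupervised, zero-shot); the softmax-over-keys router above is just the simplest stand-in.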
Great work led by our past intern Samin. TL;DR sparse masks are a great PEFT method + they merge well!
Happy to share our paper "Exploring Sparse Adapters for Scalable Merging of Parameter-Efficient Experts" has been accepted at #COLM 2025!
- paper: arxiv.org/abs/2507.07140
- authors: @zhansu9 @kim__minseon Oleksiy @OhibRiyasat @TheEsraaSaleh Doina Precup @LucasPCaccia @murefil
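To make the "sparse masks merge well" intuition concrete, here is a hedged toy sketch: each expert trains a sparse delta over a shared base weight, and merging averages the deltas wherever their supports overlap. This is an illustration of the general idea, not the paper's implementation:

```python
import numpy as np

# Toy sketch: experts as sparse deltas over a shared base weight.
rng = np.random.default_rng(1)
base = rng.standard_normal((8, 8))   # shared base weight

def sparse_delta(density=0.1):
    # A sparse update: most entries are exactly zero (the mask).
    mask = rng.random((8, 8)) < density
    return rng.standard_normal((8, 8)) * mask

deltas = [sparse_delta() for _ in range(3)]

# Per entry, count how many experts touched it, then average overlapping
# updates so no single expert dominates shared coordinates.
counts = sum((d != 0).astype(float) for d in deltas)
merged = base + sum(deltas) / np.maximum(counts, 1.0)
```

Because the masks are sparse, different experts mostly update disjoint coordinates, so merging causes little interference, which is one plausible reading of why sparse adapters compose well.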
CFP of the Wordplay 2025 (EMNLP) is live! wordplay-workshop.github.io
Announcing the 5th Wordplay Workshop at EMNLP 2025 (Suzhou, China). We are co-organizing the CPDC Challenge (total prize value USD 20K!!!), the warm-up round is starting now! wordplay-workshop.github.io
If you are looking to explore LLMs for debugging, please check this out!
Developers spend a lot of time debugging code. Learn how debug-gym can equip AI agents to help, enabling them to set breakpoints, navigate the codebase, and print runtime variable values on demand, so they better understand the code and its execution flow: msft.it/6017qF6RT
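The primitives described (set a breakpoint, inspect runtime variables) can be sketched with Python's standard-library `bdb` debugger framework. This is not the debug-gym API, only a minimal illustration of the kind of capability an agent gets:

```python
import bdb

class VarWatcher(bdb.Bdb):
    """Records the value of one variable each time a line in a target
    function is about to execute - a toy stand-in for an agent that sets
    a breakpoint and prints a runtime variable on demand."""

    def __init__(self, func_name, var):
        super().__init__()
        self.func_name, self.var = func_name, var
        self.values = []

    def user_line(self, frame):
        # Called before each traced line; snapshot the watched local.
        if frame.f_code.co_name == self.func_name and self.var in frame.f_locals:
            self.values.append(frame.f_locals[self.var])

def buggy_sum(xs):
    total = 0
    for x in xs:
        total += x      # an agent might watch `total` evolve here
    return total

watcher = VarWatcher("buggy_sum", "total")
result = watcher.runcall(buggy_sum, [1, 2, 3])
```

`watcher.values` then holds the successive runtime values of `total`, which is exactly the kind of execution-flow evidence a debugging agent can reason over.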
We are looking for interns to work on LLM modularization, please consider applying 🚀
We have a few intern positions open in our ML team @ MSR Montreal, come work with @Cote_Marc @kim__minseon @LucasPCaccia @mathe_per @ericxyuan on reasoning, interactive envs/coding and LLM modularization.. 🤯 @mathe_per and I will also be at #NeurIPS2024 so we can chat about this…
If you're interested in MoErging methods, here's an easy tutorial to get you started!
Explore zero-shot routing of parameter-efficient experts with Phatgoose arxiv.org/abs/2402.05859 and Arrow arxiv.org/abs/2405.11157 w. github.com/microsoft/mttl 👉 github.com/sordonia/pg_mb… Part of "Dynamic Sparsity in ML" tuto #neurips2024, join for discussions! 😊 thx @zhansu9
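Arrow-style routing can be sketched compactly: each LoRA expert's update B @ A is summarized by its top right singular vector (its "arrow"), and a token is routed to the experts whose arrow best aligns with its hidden state. The code below is an illustrative approximation under that description, not the mttl implementation:

```python
import numpy as np

# Hedged sketch of Arrow-style zero-shot routing over a library of LoRAs.
rng = np.random.default_rng(2)
d, r, n_experts, top_k = 32, 4, 5, 2
# Each expert is a (B, A) LoRA pair; illustrative random weights.
experts = [(rng.standard_normal((d, r)), rng.standard_normal((r, d)))
           for _ in range(n_experts)]

def arrow(B, A):
    # Top right singular vector of the low-rank update B @ A.
    _, _, Vt = np.linalg.svd(B @ A)
    return Vt[0]

arrows = np.stack([arrow(B, A) for B, A in experts])

def route(x, k=top_k):
    # Score by |x . arrow|; the singular vector's sign is arbitrary,
    # so take the absolute value, then keep the top-k experts.
    scores = np.abs(arrows @ x)
    return np.argsort(scores)[::-1][:k]
```

No router training is needed: the routing signal is computed purely from the experts' own weights, which is what makes the approach zero-shot.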
I'm on the job market! Please reach out if you are looking to hire someone to work on
- RLHF
- Efficiency
- MoE/Modular models
- Synthetic Data
- Test time compute
- other phases of pre/post-training.
If you are not hiring then I would appreciate a retweet! More details👇
We are hiring a Senior Researcher in Montréal! Please consider applying :) More info below
The ML team at @MSFTResearch Montréal 🍁 is hiring a Senior Researcher with a background in ML / NLP!!! Come work with us at the intersection of interactivity, modularity and reasoning in foundation models 😊 MSR is a highly collaborative environment where risky ideas are…
Great opportunity for potential students!
Come study with us at Mila! I will be looking for new students to work with. Our current projects explore continual learning, modularity, scrutability, algorithm discovery, AI for law (reasoning), invariances, and decision-making...
We have a Principal ML Engineer role opening at MSR Montreal. Come and do research with us :) jobs.careers.microsoft.com/global/en/job/…
Presenting this today at 1:30 Vienna time!
[3/3] Towards Modular LLMs by Building and Reusing a Library of LoRAs @LucasPCaccia x.com/_akhaliq/statu…
Made it to Vienna for ICML. Please reach out if you wanna chat!