Stefan Horoi
@stefanhoroi
PhD student at @UMontreal and @Mila_Quebec, currently working on model merging and representation comparison.
🔎Do better expert models always lead to better model merging & MoErging? And how does the duration of expert training affect model upcycling? We tackle these questions in our recent work: “Less is More: Undertraining Experts Improves Model Upcycling” 🧵1/N
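For readers new to the area: the simplest model-merging baseline just averages the parameters of experts fine-tuned from a shared base model. Below is a minimal sketch of that baseline, assuming compatible PyTorch state dicts; `average_state_dicts` is an illustrative helper, not the method studied in the paper.

```python
import torch

def average_state_dicts(state_dicts: list[dict]) -> dict:
    """Element-wise mean of several compatible state dicts (same keys/shapes).
    A common baseline for merging experts fine-tuned from one base model."""
    return {
        key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
        for key in state_dicts[0]
    }
```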
How do MoE transformers, like DeepSeek, behave under distribution shifts? Do their routers collapse? Can they still match full re-training performance? Excited to present “Continual Pre-training of MoEs: How robust is your router?”!🧵arxiv.org/abs/2503.05029 1/N
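For context on what the "router" is: in an MoE layer, a small learned gate scores each token against every expert and dispatches it to its top-k choices. Here is a minimal sketch of a generic top-k router in PyTorch; this is the common textbook formulation, not DeepSeek's exact implementation (which adds load balancing and other details).

```python
import torch
import torch.nn.functional as F

class TopKRouter(torch.nn.Module):
    """Generic learned top-k router: a linear gate scores every token for
    every expert, and each token is sent to its k highest-scoring experts."""

    def __init__(self, d_model: int, n_experts: int, k: int = 2):
        super().__init__()
        self.gate = torch.nn.Linear(d_model, n_experts, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor):
        # x: (n_tokens, d_model) -> logits: (n_tokens, n_experts)
        logits = self.gate(x)
        weights, expert_ids = logits.topk(self.k, dim=-1)
        # Renormalize the gate scores over the selected experts only
        weights = F.softmax(weights, dim=-1)
        return weights, expert_ids
```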
Very excited to present our paper "Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis" at @icmlconf 2024! Come see our poster tomorrow, Wed. July 24th, 1:30-3 pm. Paper: openreview.net/forum?id=hLuNV… Code: github.com/shoroi/align-n… @Mila_Quebec #ICML2024
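Since the title names Canonical Correlation Analysis: CCA finds linear projections under which two sets of features are maximally correlated, which is one way to align two networks' representations before merging them. Below is a minimal NumPy sketch of classical CCA on activation matrices collected from the same inputs; `cca_directions` is illustrative only, the paper's actual procedure is in the linked code.

```python
import numpy as np

def cca_directions(X: np.ndarray, Y: np.ndarray, eps: float = 1e-8):
    """Classical CCA via SVD of the whitened cross-covariance.

    X, Y: (n_samples, d1) and (n_samples, d2) activation matrices
    from two networks evaluated on the same inputs."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric PSD matrix
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(np.maximum(w, eps))) @ V.T

    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, corrs, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    # Columns of the returned bases project X and Y onto maximally
    # correlated directions; corrs are the canonical correlations.
    return Wx @ U, Wy @ Vt.T, corrs
```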
My most sincere thanks to the Schulich Foundation, Mr. Seymour Schulich, and the Université de Montréal! #2017SLSquad
