Benno Krojer
@benno_krojer
AI phding @Mila_Quebec @mcgillu (past: @AIatMeta). Interests: interpretability, language grounding (V+L), evals, reasoning. Vanier Scholar. 🥏⚽🥨
Love to see this! I am always hoping for papers that show that text-only understanding is influenced by being physically grounded (images, videos, interaction). It was a big hope of people years ago with few positive findings, so I'm glad it is still being explored!
Does vision training change how language is represented and used in meaningful ways?🤔 The answer is a nuanced yes! Comparing VLM-LM minimal pairs, we find that while the taxonomic organization of the lexicon is similar, VLMs are better at _deploying_ this knowledge. [1/9]
A blizzard is raging in Montreal when your friend says “Wow, the weather is amazing!” Humans easily interpret irony, while LLMs struggle with it. We propose a 𝘳𝘩𝘦𝘵𝘰𝘳𝘪𝘤𝘢𝘭-𝘴𝘵𝘳𝘢𝘵𝘦𝘨𝘺-𝘢𝘸𝘢𝘳𝘦 probabilistic framework as a solution. arxiv.org/abs/2506.09301 @ #acl2025
Love this series of blog posts - a very insightful peek at the process that goes on behind a cool paper! I'd love to see more authors write posts like these
This was an incredibly important project to me - I’ve wanted to solve it for years, but had no idea how. This was all @sukjun_hwang and @fluorane's amazing work! I wrote about the story of its development, and what might be coming next. The H-Net: goombalab.github.io/blog/2025/hnet…
Very cool stuff. I was amazed to learn from Tom that there is a programming language, RASP, that compiles symbolic algorithms into transformer weights. He built on top of that and studied how we can distill those "idealized transformers" into our regular LLMs.
🥳 New Paper @ ACL Findings 🇦🇹 Instead of reverse engineering mechanisms in LLMs, can we inject our own known mechanism into a pretrained language model? Yes we can!
Cohere is excited to announce our new office in Montreal, QC! We look forward to contributing to the local AI landscape, collaborating with new and existing partners in the city, and growing our Montreal-based team. cohere.com/blog/montreal-…
I genuinely think @benno_krojer's work offers a much fairer and more insightful way to assess the physics understanding of VideoLLMs. Highly recommend giving it a read if you're curious about where current models stand!
Welcome to the lab, doctor!
I miss Edinburgh and its wonderful people already!! Thanks to @tallinzen and @PontiEdoardo for inspiring discussions during the viva! I'm now exchanging Arthur's Seat for Mont Royal to join @sivareddyg's wonderful lab @Mila_Quebec 🤩
The video is online now! 3min speed science talk on "From a soup of raw pixels to abstract meaning" youtu.be/AHsoMYG2Vqk?si…
Turns out condensing your research into 3 minutes is very hard, but it also teaches you a lot
Cool use of our AURORA work from last year to improve physical world models framed as image editing!
🔁 What if you could bootstrap a world model (state1 × action → state2) using a much easier-to-train dynamics model (state1 × state2 → action) in a generalist VLM? 💡 We show how a dynamics model can generate synthetic trajectories & serve for inference-time verification 🧵👇
"Build the web for agents, not agents for the web" This position paper argues that rather than forcing web agents to adapt to UIs designed for humans, we should develop a new interface optimized for web agents, which we call Agentic Web Interface (AWI).