Dane Malenfant

@dvnxmvl_hdf5

MSc. Computer Science @Mila_Quebec & @mcgillu in the LiNC lab | Currently distracted with multi-agent RL and neuroAI | Restless | Ēka ē-akimiht

Montréal, Québec

Joined December 2023

268Following

123Followers

Pinned

Dane Malenfant@dvnxmvl_hdf5 · Jun 5

Preprint Alert 🚀 Multi-agent reinforcement learning (MARL) often assumes that agents know when other agents cooperate with them. But for humans, this isn’t always true. Example, plains indigenous groups used to leave resources for others to use at effigies called Manitokan. 1/8

dvnxmvl_hdf5's tweet image. Preprint Alert 🚀
Multi-agent reinforcement learning (MARL) often assumes that agents know when other agents cooperate with them. But for humans, this isn’t always true. Example, plains indigenous groups used to leave resources for others to use at effigies called Manitokan.
1/8

3.0K

Pinned

Dane Malenfant@dvnxmvl_hdf5 · Jul 19

Having competed in the IMO, I find the generalist training this work provides to be its most incredible aspect. 🧙‍♂️ No more rabbithole specialization as we did in high school, thinking specifically about olympiad problem solving 🤔, eating and sleeping olympiad problems.…

AAlexander Wei@alexwei_ · Jul 19

1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

4.0K

Dane Malenfant Retweeted

Guan Wang@makingAGI · Jul 21

🚀Introducing Hierarchical Reasoning Model🧠🤖 Inspired by brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, no pretraining or CoT! Unlock next AI breakthrough with…

182

546

4.0K

3.0K

1.1M

Dane Malenfant Retweeted

Akhil Bagaria@akhil_bagaria · Jul 25

🚨New workshop alert 🚨 Calling all RL researchers in the New York area 🧠🗽 Present your work at the first-ever New York Reinforcement Learning Workshop (NYRL), co-organized by Amazon, Columbia Business School & NYU Tandon School of Engineering. ny-rl.com

7.0K

Dane Malenfant Retweeted

Chen Sun 🤖🧠🇨🇦@ChenSun92 · Jul 25

Our team at @GoogleDeepMind is looking to hire a talented new Research Scientist! Our group (under @edchi) aims to push the frontier of AI-human interactions through personalization of LLMs and deeply understanding the open-ended nature of user intentions. Beneath this lies…

250

120

24.0K

Dane Malenfant@dvnxmvl_hdf5 · Jul 25

After living in Montréal for 7 years, I’ve always really liked being able to walk at night

JJust Bins@JustBins · Jun 20

Welcome to Prince Albert.

209

Dane Malenfant@dvnxmvl_hdf5 · Jul 20

Ethan is spot on. Sadly, widespread hostility toward AI in the humanities means they miss opportunities to shape the tools, platforms, and debates of tomorrow. That reluctance only deepens the marginalisation of cultural and historical knowledge in our society.

EEthan Mollick@emollick · Jul 20

Don't leave AI to the STEM folks. They are often far worse at getting AI to do stuff than those with a liberal arts or social science bent. LLMs are built from the vast corpus human expression, and knowing the history & obscure corners of human works lets you do far more with AI

174

12.0K

Dane Malenfant Retweeted

Claas Voelcker@c_voelcker · Jul 18

🔥🚨 Preprint alert: Relative Entropy Pathwise Policy Optimization #REPPO 🚨🔥 What if you could have on-policy training without the instability and parameter tuning that plagues #PPO? What if training with deterministic policy gradient just worked? With our new method it does!

6.0K

Dane Malenfant@dvnxmvl_hdf5 · Jul 18

As the field moves towards agents doing science, the ability to understand novel environments through interaction becomes critical. AutumnBench is an attempt at measuring this abstract capability in both humans and current LLMs. Check out the blog post for more insights!

BBasis@BasisOrg · Jul 17

We’re proud to announce the launch of AutumnBench, an open-source benchmark developed on our Autumn platform. This benchmark, led by our MARA team, provides a novel platform for evaluating world modeling and causal reasoning in both human and artificial intelligence.

1.0K

Dane Malenfant@dvnxmvl_hdf5 · Jul 17

I really like how @weights_biases plots the KL in multi-agent PPO

977

Dane Malenfant Retweeted

Mila - Institut québécois d'IA@Mila_Quebec · Jul 16

That’s a wrap on the Indigenous AI Gathering 2025. Over two days, Indigenous knowledge keepers, technologists, youth, and allies came together to chart new paths in AI inclusive of Indigenous ethics and perspectives. Thank you for being part of this vital conversation. With…

1.0K

Dane Malenfant Retweeted

Eric Elmoznino@EricElmoznino · Jul 15

Recently went on a podcast to talk about consciousness and AI. Very timely given how common interactions with AI systems are becoming, and how convincingly "human" LLMs can appear to be in our conversations with them. Was a fun conversation - check it out! youtu.be/C7kGlnVWHjQ

1.0K

Dane Malenfant Retweeted

Joey Bose@bose_joey · Jul 9

🎉Personal update: I'm thrilled to announce that I'm joining Imperial College London @imperialcollege as an Assistant Professor of Computing @ICComputing starting January 2026. My future lab and I will continue to work on building better Generative Models 🤖, the hardest…

603

51.0K

Dane Malenfant Retweeted

Thomas Möllenhoff@tmoellenhoff · Jul 7

Excited to announce our recent work on low-precision deep learning via biologically-inspired noisy log-normal multiplicative dynamics (LMD). It allows us to train large neural nets (such as GPT-2 and ViT) in FP6. arxiv.org/abs/2506.17768

110

13.0K

Dane Malenfant@dvnxmvl_hdf5 · Jul 6

Even simple tasks like sharing objects for others can be hard to learn

AAmanda Askell@AmandaAskell · Jul 5

"Just train the AI models to be good people" might not be sufficient when it comes to more powerful models, but it sure is a dumb step to skip.

299

Dane Malenfant@dvnxmvl_hdf5 · Jul 4

What does it mean when I read IRL as inverse reinforcement learning instead of in real life?

195

Dane Malenfant@dvnxmvl_hdf5 · Jul 1

In our latest paper, we discovered a surprising result: training LLMs with self-play reinforcement learning on zero-sum games (like poker) significantly improves performance on math and reasoning benchmarks, zero-shot. Whaaat? How does this work? We analyze the results and find…

BBo Liu (Benjamin Liu)@Benjamin_eecs · Jul 1

We've always been excited about self-play unlocking continuously improving agents. Our insight: RL selects generalizable CoT patterns from pretrained LLMs. Games provide perfect testing grounds with cheap, verifiable rewards. Self-play automatically discovers and reinforces…

273

149

26.0K