Diego Calanzone
@diegocalanzone
« artificia docuit fames » / PhD student @Mila_Quebec, curr. tackling ⚙️ RL & 🧪 drug design / 🏛️ AI grad @UniTrento
🧵 Everyone is chasing new diffusion models, but what about the representations they generate from? We introduce Discrete Latent Codes (DLCs):
- Discrete representations for diffusion models
- SOTA unconditional generation FID (1.59 on ImageNet)
- Compositional generation
- Integration with LLMs 🧱
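For intuition, here is a minimal, generic sketch of turning continuous latents into discrete codes with a nearest-neighbour codebook (plain vector quantization). This only illustrates the general idea of discrete latent representations; it is not the DLC construction from the paper, and all shapes are made up.

```python
import numpy as np

# Toy illustration: map continuous latents to discrete codes via a codebook.
# Generic VQ-style quantization, NOT the DLC method itself.
rng = np.random.default_rng(0)

codebook = rng.normal(size=(512, 64))   # 512 code vectors of dimension 64
latents = rng.normal(size=(16, 64))     # a batch of 16 continuous latents

# Nearest codebook entry per latent (squared Euclidean distance).
dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
codes = dists.argmin(axis=1)            # discrete code indices, shape (16,)
quantized = codebook[codes]             # discrete latents the generator conditions on

print(codes[:5])
```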
Good points. The competition's inherent constraints are part of the resulting performance and should be taken into account. Though Terence didn't mention that human competitors run on a negligible fraction of the energy ;)
Terence Tao on the supposed gold medal from OpenAI at the IMO
We blend imitation (SFT) and exploration (RLVR) in post-training with a simple idea: sample a prefix of an SFT demonstration, let your policy model complete it, and mix it with the other RLVR rollouts. Intuitively, the model relies more on hints for problems currently out of reach.
🚀 Introducing Prefix-RFT to blend SFT and RFT! SFT can learn more complex problems by mimicking demonstrations, but may generalize poorly. RFT has better overall performance but is limited by the initial policy. Our method, Prefix-RFT, makes the best of both worlds!
partially brewed at Mila
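A minimal sketch of the prefix-mixing idea described above, assuming a hypothetical `policy.generate` API and string-typed demonstrations; this is one reading of the recipe, not the authors' implementation.

```python
import random

def make_prefix_rollouts(demos, policy, prefix_frac_range=(0.2, 0.8)):
    """For each SFT demonstration, keep a random prefix and let the policy
    complete the rest; the results get mixed with ordinary RLVR rollouts."""
    rollouts = []
    for prompt, demo_answer in demos:
        frac = random.uniform(*prefix_frac_range)
        cut = int(len(demo_answer) * frac)
        prefix = demo_answer[:cut]                     # hint taken from the demo
        completion = policy.generate(prompt + prefix)  # hypothetical policy API
        rollouts.append((prompt, prefix + completion))
    return rollouts

# Mixed batch for the RL update: some pure on-policy rollouts, some prefix-guided.
# batch = on_policy_rollouts + make_prefix_rollouts(demos, policy)
```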
Google DeepMind just dropped a new LLM architecture called Mixture-of-Recursions. It gets 2x inference speed, reduced training FLOPs, and ~50% less KV-cache memory. Really interesting read. Has the potential to be a Transformer killer.
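For intuition, a toy reading of the recursion idea: one shared block reused across depth, with a per-token router choosing how many recursions each token gets. Shapes and names are made up, and this is my sketch, not the paper's code.

```python
import torch
import torch.nn as nn

class ToyMixtureOfRecursions(nn.Module):
    """Toy sketch: a shared transformer block applied recursively, with a
    router assigning each token its own recursion depth. Not the paper's code."""
    def __init__(self, dim=64, max_depth=3):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.router = nn.Linear(dim, max_depth)    # per-token depth logits
        self.max_depth = max_depth

    def forward(self, x):                          # x: (batch, seq, dim)
        depth = self.router(x).argmax(-1)          # chosen recursion depth per token
        for d in range(self.max_depth):
            active = (depth > d).unsqueeze(-1)     # tokens that keep recursing
            # A real implementation would skip compute and KV cache for
            # inactive tokens; that is where the reported savings come from.
            x = torch.where(active, self.block(x), x)
        return x

print(ToyMixtureOfRecursions()(torch.randn(2, 8, 64)).shape)  # torch.Size([2, 8, 64])
```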
This is an issue on multiple levels, and authors using those "shortcuts"👀 are equally responsible for this unethical behaviour
To clarify: I didn't mean to shame the authors of these papers. The real issue is AI reviewers; what we see here is just authors trying to defend against that in some way. (The proper way would be to identify poor reviews and ask the AC or meta-reviewer to discard them.)
A comprehensive article on approaches to Hierarchical RL!
As AI agents face increasingly long and complex tasks, decomposing them into subtasks becomes increasingly appealing. But how do we discover such temporal structure? Hierarchical RL provides a natural formalism, yet many questions remain open. Here's our overview of the field🧵
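As a toy illustration of the temporal decomposition described above, here is an options-style two-level control loop: a high-level policy picks a subtask, and a low-level policy acts until that subtask terminates. The environment API (Gym-like) and policy names are hypothetical, not from the overview.

```python
def run_hierarchical_episode(env, high_policy, low_policies, max_steps=1000):
    """Options-style loop: the high-level policy chooses a subtask (option);
    the matching low-level policy acts until the option terminates."""
    obs = env.reset()
    done, t = False, 0
    while not done and t < max_steps:
        option = high_policy(obs)                  # pick a subtask / option
        low = low_policies[option]
        opt_done = False
        while not (done or opt_done) and t < max_steps:
            action = low.act(obs)                  # primitive, low-level action
            obs, reward, done, _ = env.step(action)
            opt_done = low.terminates(obs)         # option's termination condition
            t += 1
    return t
```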
A good cautionary lesson on using AIs to write papers: this alleged response to the (dubious) "Illusion of Thinking" paper is full of mathematical errors arxiv.org/abs/2506.09250
Two years in the making, we finally have 8 TB of openly licensed data with document-level metadata for authorship attribution, licensing details, links to original copies, and more. Hugely proud of the entire team.
Can you train a performant language model without using unlicensed text? We are thrilled to announce the Common Pile v0.1, an 8TB dataset of openly licensed and public-domain text. We train 7B models for 1T and 2T tokens and match the performance of similar models like LLaMA 1 and 2.
The language modeling version
Cautionary tale for anyone eagerly diving into RL training. LLMs are a bit more forgiving, but only so much. You should still do it, though.
Giving your models more time to think before prediction, e.g. via smart decoding, chain-of-thought reasoning, latent thoughts, etc., turns out to be quite effective for unblocking the next level of intelligence. New post is here :) “Why we think”: lilianweng.github.io/posts/2025-05-…
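One concrete instance of "more time to think" is self-consistency: sample several chain-of-thought answers and keep the majority final answer. In this toy sketch, `sample_cot_answer` is a hypothetical stand-in for an actual model call.

```python
from collections import Counter

def sample_cot_answer(question, seed):
    """Hypothetical stand-in for sampling one chain-of-thought completion
    and extracting its final answer (in practice, an LLM call)."""
    return "17" if seed % 3 else "18"   # toy noisy answers

def self_consistent_answer(question, n_samples=16):
    # More test-time compute: sample many reasoning paths, majority-vote the answer.
    answers = [sample_cot_answer(question, s) for s in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistent_answer("What is 8 + 9?"))   # "17"
```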
The beauty of rethinking architectures
Introducing Continuous Thought Machines New Blog: sakana.ai/ctm/ Modern AI is powerful, but it’s still distinct from human-like flexible intelligence. We believe neural timing is key. Our Continuous Thought Machine is built from the ground up to use neural dynamics as…
Presenting in short! 👉🏼 Mol-MoE: leveraging model merging and RLHF for test-time steering of molecular properties. 📆 today, 11:15am to 12:15pm 📍 Poster session #1, GEM Bio Workshop @gembioworkshop @sparseLLMs #ICLR #ICLR2025
In Mol-MoE, we propose a framework to train router networks that reuse property-specific molecule generators. This allows personalizing drug generation at test time by following property preferences! We discuss some challenges. @proceduralia @pierrelux arxiv.org/abs/2502.05633
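A minimal sketch of the routing idea, under assumed shapes and names (not the Mol-MoE code): a router turns a user's property-preference vector into mixing weights over property-specific expert outputs.

```python
import torch
import torch.nn as nn

class ToyPropertyRouter(nn.Module):
    """Toy sketch: mix outputs of property-specific experts with weights
    derived from a user's property-preference vector. Not the Mol-MoE code."""
    def __init__(self, n_experts=4, dim=128):
        super().__init__()
        self.gate = nn.Linear(n_experts, n_experts)   # preferences -> mixing logits

    def forward(self, expert_outputs, preferences):
        # expert_outputs: (n_experts, batch, dim); preferences: (batch, n_experts)
        weights = torch.softmax(self.gate(preferences), dim=-1)      # (batch, n_experts)
        mixed = torch.einsum("be,ebd->bd", weights, expert_outputs)  # weighted mix
        return mixed

router = ToyPropertyRouter()
outs = torch.randn(4, 2, 128)                                   # 4 experts, batch of 2
prefs = torch.tensor([[1., 0., 0.5, 0.], [0., 1., 0., 1.]])     # desired property weights
print(router(outs, prefs).shape)                                # torch.Size([2, 128])
```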
@PaglieriDavide presenting BALROG to an awesome crowd, ft. @CULLYAntoine. Come see how frontier models perform in open-ended gaming environments.
If you are at @iclr_conf and interested in making your RLHF really fast, come find @mnoukhov and me at poster #582.
Imagine an anonymous poster consisting of handwritten pages with methodology and results. “What matters is spreading the idea” #ICLR

Happening now in Hall 3, booth 223!
In LoCo-LMs, we propose a neuro-symbolic loss function to fine-tune an LM to acquire logically consistent knowledge from a domain graph, i.e. with respect to a set of logical consistency rules. @looselycorrect @tetraduzione arxiv.org/abs/2409.13724
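A toy version of a logical-consistency penalty, in the generic semantic-loss style rather than the actual LoCo-LMs objective: for a rule A → B, the violating assignment is A ∧ ¬B, so we penalize the probability mass the model places on it.

```python
import torch

def implication_consistency_loss(p_a, p_b, eps=1e-8):
    """Toy penalty for the rule A -> B: minimize -log(1 - P(A) * (1 - P(B))),
    the log-probability of NOT violating the rule. Generic sketch, not LoCo-LMs."""
    violation = p_a * (1.0 - p_b)
    return -torch.log(1.0 - violation + eps)

# Model beliefs: P("x is a dog") = 0.9, P("x is an animal") = 0.3 -> large penalty.
print(implication_consistency_loss(torch.tensor(0.9), torch.tensor(0.3)))
```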