Jean-François Ton

@jeanfrancois287

ByteDance Seed @ByteDance_Seed | Senior Research Scientist working on LLMs | prev. @oxcsml @UniofOxford, @amazon, @apple, @bloomberg All opinions are my own

London, United Kingdom

Joined January 2015

1K Following

1K Followers

Pinned

Jean-François Ton@jeanfrancois287 · Jul 16

📢 New paper on Multi-Agent LLMs 📢 Our new paper presents Multi-agent-guided Leader Policy Optimisation (MLPO). We train a single leader LLM that steers a team of off-the-shelf agents to solve tasks. Detailed thread below 👇 🧵 arxiv.org/abs/2507.08960

Pinned

Jean-François Ton@jeanfrancois287 · Jul 25

This has to be a joke, right? right?

Yiping Lu@2prime_PKU · Jul 25

Anyone knows adam?

Jean-François Ton@jeanfrancois287 · Jul 24

Right before #imo2025, together with colleagues from Mountain View, NYC, Singapore, etc, we all gathered at @GoogleDeepMind headquarter in London for our final push for IMO. I believe that week was when all magic happened! We put all individual recipes (that we figured out…

Thang Luong@lmthang · Jul 21

Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level in the International Mathematical Olympiad! 🏆, solving five out of six problems perfectly, as verified by the IMO organizers! It’s been a wild run to lead this…

Jean-François Ton Retweeted

Qwen@Alibaba_Qwen · Jul 22

>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…

Jean-François Ton Retweeted

Xiao Ma@yusufma555 · Jul 22

🚀🚀🚀 Ever wondered what it takes for robots to handle real-world household tasks? long-horizon execution, deformable object dexterity, and unseen object generalization — meet GR-3, ByteDance Seed’s new Vision-Language-Action (VLA) model! GR-3 is a generalizable…

Jean-François Ton Retweeted

Owain Evans@OwainEvans_UK · Jul 22

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵

Jean-François Ton Retweeted

YIFENG LIU@YIFENGLIU_AI · May 26

1/6 We introduce RPG, a principled framework for deriving and analyzing KL-regularized policy gradient methods, unifying GRPO/k3-estimator and REINFORCE++ under this framework and discovering better RL objectives than GRPO: Paper: arxiv.org/abs/2505.17508 Code:…

Jean-François Ton Retweeted

Sheheryar Zaidi@ShehZaidi · Jul 21

Exciting and unique opportunity in our team at @GoogleDeepMind: we're hiring a laboratory scientist to build out a materials-synthesis lab 🧪 This lab will be a key part of our roadmap for training AI models capable of real-world materials discovery 🔁 job-boards.greenhouse.io/deepmind/jobs/…

Jean-François Ton Retweeted

Hao Sun - RL @ACL 🇦🇹@HolarisSun · Jul 21

🚀 RL is powering breakthroughs in LLM alignment, reasoning, and agentic apps. Are you ready to dive into the RL x LLM frontier? Join us at @aclmeeting ACL’25 tutorial: Inverse RL Meets LLM Alignment this Sunday at Vienna🇦🇹(Jul 27th, 9am) 📄 Preprint at huggingface.co/papers/2507.13…

Jean-François Ton Retweeted

Pratyush Maini@pratyushmaini · Jul 16

At #ICML2025, I am super excited to introduce STAMP. This is a marriage b/w dataset inference & watermarking that finally(!) lets creators PROVE their content was used to train LLMs🔍 Its a MAJOR push taking the academic problem into real world. w/Saksham Rastogi @danish037 🧵
