Raphaël Sourty
@raphaelsrty
Language Models, Knowledge Bases, Knowledge Distillation PhD | AI @LightonIO
I'm thrilled to announce the release of FastPlaid! 🚀🚀 FastPlaid is a high-performance engine for multi-vector search, built from the ground up in Rust (with the help of Torch C++) ⚡️ You can view FastPlaid as the counterpart of Faiss for multi-vector search.
🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding…
Great excuse to share something I really love: 1-Lipschitz nets. They give clean theory, certs for robustness, the right loss for W-GANs, even nicer grads for explainability!! Yet they're still niche. Here’s a speed-run through some of my favorite papers in the field. 🧵👇
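For anyone who wants the "1-Lipschitz" part made concrete before diving into the papers, here is a minimal sketch (my own illustration, not taken from the thread): constraining each linear layer with PyTorch's spectral_norm parametrization bounds the network's Lipschitz constant by the product of the per-layer spectral norms, which is the quantity the robustness certificates build on.

```python
# Hedged sketch: an (approximately) 1-Lipschitz MLP via spectral normalization.
# Each normalized linear layer has spectral norm ~1 and ReLU is 1-Lipschitz,
# so the composition cannot move its output further than its input moved.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

net = nn.Sequential(
    spectral_norm(nn.Linear(784, 256)),  # ||W||_2 ≈ 1 after normalization
    nn.ReLU(),                           # 1-Lipschitz activation
    spectral_norm(nn.Linear(256, 10)),
)

x = torch.randn(1, 784)
delta = 0.1 * torch.randn(1, 784)
with torch.no_grad():
    ratio = torch.norm(net(x + delta) - net(x)) / torch.norm(delta)
print(f"output / input displacement: {ratio.item():.3f} (expected ≲ 1)")
```

The product-of-spectral-norms bound is loose in practice, which is one reason dedicated 1-Lipschitz architectures exist.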
optimization theorem: "assume a lipschitz constant L..." the lipschitz constant:
🚀 Introducing Qwen3-MT – our most powerful translation model yet! Trained on trillions of multilingual tokens, it supports 92+ languages—covering 95%+ of the world’s population. 🌍✨ 🔑 Why Qwen3-MT? ✅ Top-tier translation quality ✅ Customizable: terminology control, domain…
🚀 Introducing GLiClass‑V3 – a leap forward in zero-shot classification! Matches or beats cross-encoder accuracy, while being up to 50× faster. Real-time inference is now possible on edge hardware. huggingface.co/collections/kn… #TextClassification #NLP #ZeroShot #GLiClass
Ok the solution might be way easier than expected ... it's been a long time since we last released a SOTA model, hasn't it? 😇
I am starting to be more and more convinced that MaxSim generalizes very well to long documents but struggles on longer queries, most probably due to the asymmetry. Larger documents are bounded by the number of query tokens, but larger queries might get noisy. Either it is a query…
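A tiny numpy sketch of the standard MaxSim / late-interaction score (my own illustration, not code from any particular library) makes that asymmetry visible: with unit-normalized token embeddings each query token contributes at most 1, so the score is bounded by the number of query tokens no matter how long the document is, while every extra query token adds another (possibly noisy) max term to the sum.

```python
# Hedged sketch of MaxSim: each query token keeps its best-matching document
# token, and those per-token maxima are summed over the query.
import numpy as np

def maxsim(Q, D):
    """Q: (num_query_tokens, dim), D: (num_doc_tokens, dim), rows unit-normalized."""
    sims = Q @ D.T                 # token-to-token cosine similarities
    return sims.max(axis=1).sum()  # best doc token per query token, then sum

rng = np.random.default_rng(0)
unit = lambda M: M / np.linalg.norm(M, axis=1, keepdims=True)

Q = unit(rng.normal(size=(32, 128)))          # 32 query tokens
D_short = unit(rng.normal(size=(200, 128)))   # short document
D_long = unit(rng.normal(size=(4000, 128)))   # much longer document

# Both scores stay below |Q| = 32: document length only affects the max,
# while adding query tokens adds new terms to the sum.
print(maxsim(Q, D_short), maxsim(Q, D_long), "<=", len(Q))
```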
Bye Qwen3-235B-A22B, hello Qwen3-235B-A22B-2507! After talking with the community and thinking it through, we decided to stop using hybrid thinking mode. Instead, we’ll train Instruct and Thinking models separately so we can get the best quality possible. Today, we’re releasing…
Some of the ModernBERT team is back with new encoder models: Ettin, ranging from tiny to large: 17M, 32M, 68M, 150M, 400M & 1B parameters. They also trained decoder models & checked if decoders could classify & if encoders could generate. Details in 🧵:
Introducing ColQwen-Omni, a 3B omnimodal retriever that extends the ColPali concept of multimodal retrieval with late interaction to audio chunks and short videos, with no performance degradation on visual document retrieval wrt our best models! (1/N)
TorchDR 0.3 is here with some major improvements, taking the library to the next level! TorchDR leverages vectorized implementations on GPU for super fast dimensionality reduction. Thanks to all the contributors!! Description below 🧵
✊ Transformers... Assemble! Introducing ♊Ettin Suite, a SoTA open recipe to outperform existing Generative & Retrieval Models. Developed by @JohnsHopkins in collaboration with @LightOnIO, Ettin is the first-ever SoTA suite of paired encoder & decoder models. The revolution…
To anyone wondering what the difference is between encoders and decoders on downstream tasks when both models are trained the same way, this blog post is for you. Very interesting resource and new models available, impressive work 🙌
Should we just focus our pre-training efforts on decoders? To answer this, we trained Ettin, a suite of identically trained encoders and decoders ranging from 17M to 1B parameters on 2T tokens of open data (beating Llama 3.2 and ModernBERT in the process)!
🤔 Have you ever wondered how good ModernBERT is compared to decoders like Llama? We made an open-data version of ModernBERT and used the same recipe for encoders and decoders. Turns out, our encoder model beats ModernBERT and our decoder model beats Llama 3.2 / SmolLM2 🤯 🧵
The #SIGIR2025 Best Paper just awarded to the WARP engine for fast late interaction! Congrats to Luca Scheerer🎉 WARP was his @ETH_en MS thesis, completed while visiting us at @StanfordNLP. Incidentally, it's the fifth Paper Award for a ColBERT paper since 2020!* Luca did an…
📢 If you’re at #SIGIR2025 this week, make sure to be at Luca Scheerer’s paper talk: “WARP: An Efficient Engine for Multi-Vector Retrieval” (Wednesday 11am) WARP makes PLAID, the famous ludicrously fast ColBERT engine, another 3x faster on CPUs. With the usual ColBERT quality!
Alright, that’s it: WARP coming to PyLate soon™️
This was a really enjoyable and approachable blog post. I think this is my favorite explanation of MaxSim and I'm going to use it moving forward. But this snippet doesn't do the whole post justice---read the whole thing!
If you've made it this far down the thread, you might want a link reminder, so here you are: Github: github.com/mixedbread-ai/… Blog: mixedbread.com/blog/maxsim-cpu
Awesome new lib from @bclavie to make MaxSim computation quicker on CPU! In the same vein, also worth highlighting pylate-rs by @raphaelsrty! Both libs go a long way toward making Late Interaction (ColPali, ColBERT) ever more accessible!
New blog post & new library are out now! The blog post is about MaxSim, why it's *orders of magnitude* more demanding than normal cosine similarity, and why GPUs don't care, but CPUs do! The library is maxsim-cpu, which makes it so CPUs can be fast and play it cool, too.
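To give a rough sense of that "orders of magnitude" claim (my own back-of-the-envelope numbers, not taken from the blog post): a single-vector retriever scores a document with one dot product in the embedding dimension, while MaxSim needs the full query-tokens × document-tokens similarity matrix before the max/sum reduction.

```python
# Hedged FLOP estimate with illustrative sizes: why MaxSim is far heavier per
# document than plain cosine similarity between two pooled vectors.
dim, n_query_tokens, n_doc_tokens = 128, 32, 300

cosine_flops = 2 * dim                                  # one d-dimensional dot product
maxsim_flops = 2 * dim * n_query_tokens * n_doc_tokens  # |Q| x |D| dot products

print(f"cosine : {cosine_flops:>12,} FLOPs per document")
print(f"MaxSim : {maxsim_flops:>12,} FLOPs per document "
      f"({maxsim_flops // cosine_flops:,}x more)")
```

GPUs hide that extra work behind one big batched matmul; on CPUs it is exactly the cost that libraries like maxsim-cpu and pylate-rs are working to bring down.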