Subham Sahoo
@ssahoo_
PhD candidate @cornell working on Diffusion Language Models. Previously @GoogleAI, @IITKgp.
🚨 “The Diffusion Duality” is out! @ICML2025 ⚡️ Few-step generation in discrete diffusion language models by exploiting the underlying Gaussian diffusion. 🦾Beats AR on 3/7 zero-shot likelihood benchmarks. 📄 Paper: arxiv.org/abs/2506.10892 💻 Code: github.com/s-sahoo/duo 🧠…
📢 Excited to announce that GenMol is now open-sourced. GenMol: A Drug Discovery Generalist with Discrete Diffusion Paper: arxiv.org/abs/2501.06158 Code: github.com/NVIDIA-Digital…
🚀 GenMol is now open‑sourced: you can now train and finetune on your data! It uses masked diffusion + a fragment library to craft valid SAFE molecules, from de novo design to lead optimization. #GenMol #DrugDiscovery #Biopharma
🚨 The era of infinite internet data is ending, So we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
📢 Duo and Eso-LMs at 2B scale on Slim Pajama These models will finish training in a few days. While HF release may take time due to corporate red tape, we'll try providing early access case-by-case. Email [email protected] with the subject “Early access”. Duo:…
Attending ICML ✈️Tues-Fri to present "The Diffusion Duality" 🗓️Wed, July 16 @ 4:30pm 📍East Exhibition Hall A-B (E-3003) DM if you want to chat about diffusion LMs, or my current work on Duality or Esoteric LMs! x.com/ssahoo_/status…
🚨 [New paper alert] Esoteric Language Models (Eso-LMs) First Diffusion LM to support KV caching w/o compromising parallel generation. 🔥 Sets new SOTA on the sampling speed–quality Pareto frontier 🔥 🚀 65× faster than MDLM ⚡ 4× faster than Block Diffusion 📜 Paper:…
Ouch, my ego took a hit. Chemistry is a subject that can be gamed with rote learning, yet surprisingly, Gemini performs worse in it than in physics and math.
AI now beats every single human in the hardest college entrance exam in India, the IIT JEE. Bytedance silently published this result this week. The top scorer was Rajit Gupta with 332/360, but Google's Gemini 2.5 Pro was at rank 1 with 336/360.
I screwed over one of my top engineers when I was a Senior Manager at Amazon. He felt betrayed, found another job, and resigned. This is a dark spot on my career, so learn from my mistake. Here’s the story:
🌟 Esoteric Language Models: hybrid AR+MDM language models. Eso-LM (s-sahoo.com/Eso-LMs/) is a new class of language models that combines autoregressive (AR) and masked diffusion (MDM) methods to balance generation quality and…
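Another major breakthrough in academia! Researchers from Cornell University, CMU, and other institutions have jointly proposed Esoteric Language Models (Eso-LMs), an innovative language modeling framework and a bold rethinking of how language models are built. Eso-…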
The Diffusion Duality: few-step generation in discrete diffusion language models via the underlying Gaussian diffusion
Check out "The Diffusion Duality" on HF papers! huggingface.co/papers/2506.10… Also see the author's collection: huggingface.co/collections/s-…
New on HF Papers: The Diffusion Duality 🤯! Unlock few-step generation in discrete diffusion language models via the underlying Gaussian diffusion. Code & models: github.com/s-sahoo/duo
The Diffusion Duality Sahoo et al.: arxiv.org/abs/2506.10892 #ArtificialIntelligence #DeepLearning #MachineLearning
NEW RESEARCH: Approximating Language Model Training Data from Weights ever wonder how much information is available in an open-weights model? DeepSeek R1 weights are 1.2 TB... what can we learn from all those bits? our method reverses LLM finetuning to recover data: 🧵
A new paper just dropped: The Diffusion Duality youtube.com/watch?v=0eaGzf…
3. The Diffusion Duality 🔑 Keywords: diffusion models, Gaussian diffusion, curriculum learning, discrete consistency distillation, text generation 💡 Category: Generative Models 🌟 Research Objective: To enhance the performance of uniform-state discrete diffusion models for…
📖 The official paper on "The Diffusion Duality". If you want to dig deep into the technical details, start here! huggingface.co/papers/2506.10…