Joaquin Vanschoren
@joavanschoren
AI Researcher, @open_ml founder, research lead @TUeindhoven. Building AI systems that learn how to learn, grow and adapt continuously & push humanity forward.
🚀 Ready to push the boundaries of #AI & #ML? We're hiring 7(!) brilliant PhDs, PostDocs, and Engineers to work on cutting-edge #LLMs #MultiModal #GenAI @TUeindhoven ✨ Think big and shape the future of AI! 🧑‍💻👩‍💻 Apply now! amore-labs.github.io/website/join/j… ❣️Please share❣️
Knowledge graphs are a game changer for AI Agents! A few ridiculous and eye-opening benchmarks comparing an AI Agent using knowledge graphs with state-of-the-art methods: • 94.8% accuracy versus 93.4% in the Deep Memory Retrieval (DMR) benchmark. • 71.2% accuracy versus…
I just put online a talk I gave summarizing what I have learned over the years as a maintainer of open source. It's _opinions_ (been there, done that), but I'm willing to defend them, having stewarded my share of successful open source projects. speakerdeck.com/gaelvaroquaux/…
New paper: "Large Language Models & Emergence: A Complex Systems Perspective" (D. Krakauer, J. Krakauer, M. Mitchell). We look at claims of "emergent capabilities" & "emergent intelligence" in LLMs from the perspective of what emergence means in complexity science. ⬇️
If you’re interested in learning about the theory behind Muon (a new optimizer), Jeremy has a great explainer in this thread. Also check out all his work leading to this (modula, modular duality, etc): docs.modula.systems/intro/reading-… It’s a beautiful theory and seems to work too!
It's been wild to see our work on Muon and the anthology start to get scaled up by the big labs. After @Kimi_Moonshot released Moonlight, people have asked whether Muon is compatible with muP. I wanted to write up an explainer, as there is something deeper going on here! (1/8)
Hypothesis generation and testing is a critical capability for AGI imo. Super excited about our AI co-scientist and other AI for Science work which are important steps towards that. We're on the cusp of an incredible new golden age of AI accelerated scientific discovery.
Diffusion language models are SO FAST!! A new startup, Inception Labs, has released Mercury Coder, "the first commercial-scale diffusion large language model". It's 5-10x faster than current gen LLMs, providing high-quality responses at low costs. And you can try it now!
First 11 chapters of RLHF Book have v0 draft done. Should be quite useful now. Next: * Crafting more blog content into future topics, * DPO+ chapter, * Meeting with publishers to get wheels turning on physical copies, * Cleaning & cohesiveness
Can we prompt robots, just like we prompt language models? With hierarchy of VLA models + LLM-generated data, robots can: - reason through long-horizon tasks - respond to variety of prompts - handle situated corrections Blog post & paper: pi.website/research/hirob…
This is something we have been cooking together for a few months and I'm very excited to announce it today. Thinking Machines Lab is my next adventure and I'm feeling very proud and lucky to start it with a group of talented colleagues. Learn more about our vision at…
Today, we are excited to announce Thinking Machines Lab (thinkingmachines.ai), an artificial intelligence research and product company. We are scientists, engineers, and builders behind some of the most widely used AI products and libraries, including ChatGPT,…
Understanding GPU bottlenecks is easy with a visualisation 👨🏻‍🍳
From an open-research point of view, probably the greatest thing about DeepSeek–R1 is how its RL training technique appears so straightforward and simple in comparison to the cumbersome approaches people were starting to think necessary for learning reasoning like Process Reward…
Make your RAG application 10x smarter! ColiVara is a unique document retrieval method that does not need chunking or text processing. It still feels like RAG but without OCR, text extraction, broken tables, or missing images. What you see is what you get. ✨ Here’s why it’s a…
🚀 Personal News: My Book "LLMOps: Managing Large Language Models in Production" is finally in early release with three chapters 📚 🌟 What You’ll Get: Chapter 1 puts a lot of fundamental concepts in perspective, from Language Model architectures to SLMs. Chapter 2 goes into the…
Over 20 European orgs/companies — backed with €54m from the EU Commission — have joined the "OpenEuroLLM" project to develop Open Source models for Europe. But will it put Europe back on the AI map? thenextweb.com/news/european-… openeurollm.eu