rasdani
@rasdani_
Founding AI Engineer @ellamindAI, open source AI @DiscoResearchAI. abundance is upon us 🚀📈✨ ramen enjoyoor 🍜
🧑🏭 🌐 From the depths of Discord servers, where anon AI enthusiasts would mingle, to the forefront of European AI research @OpenEuroLLM. 🧑🔬🇪🇺 🙏 It is truly fulfilling to give back and build a strong foundation in Europe. 🦾🇪🇺 Come join us on this journey at @ellamindAI 🚀
1/4 🇪🇺 Big news! We're joining the @OpenEuroLLM project - Europe's largest open-source AI collaboration yet! Alongside 19 leading institutions, we'll develop next-gen multilingual LLMs that combine performance with European values.
Dropping tomorrow on MLST - the serious problems with Chatbot Arena. We will talk about the recent investment and the explosive paper from Cohere researchers which identified several significant problems with the benchmark.
Recipe to post-train Qwen3 1.7B into a DeepResearch model. What does it mean for something small to think deeply? Meet Lucy, a Qwen3‑1.7B post‑trained into a DeepResearch model, built on @willccbb's verifiers. Primary rule-based rewards: - Answer correctness: we check whether the…
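A rule-based answer-correctness reward like the one named above can be sketched as a simple verifier function. This is an illustrative sketch, not Lucy's actual code: `extract_answer` and `reward_correctness` are hypothetical names, and the `\boxed{…}` convention is just one common way final answers are marked.

```python
import re

def extract_answer(completion: str):
    """Pull the final answer out of a \\boxed{...} span, if present.
    (The \\boxed convention is an assumption for this sketch.)"""
    m = re.search(r"\\boxed\{([^}]*)\}", completion)
    return m.group(1).strip() if m else None

def reward_correctness(completion: str, gold: str) -> float:
    """Rule-based reward: 1.0 if the extracted answer matches the
    reference (case-insensitive exact match), else 0.0."""
    pred = extract_answer(completion)
    if pred is None:
        return 0.0
    return 1.0 if pred.lower() == gold.strip().lower() else 0.0
```

Exact match is the simplest verifier; real recipes usually add normalization (stripping units, parsing numbers) before comparing.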
I asked @OpenAI 's o3 and @GoogleDeepMind 's Gemini 2.5 Pro to evaluate OpenAI's and GDM's IMO 2025 solutions. Result? Narrow victory for OpenAI 🏆 cc @scaling01 @rfurmaniak @GregHBurnham @alexwei_ @tszzl some details below 👇
I have to say in many respects I've had more quality conversations in Berlin than in SF. Never has it happened in SF that people actually pull out pen and paper to talk precisely about hard concepts where surface level talk is just not enough. The overton window is wider and…
I realized at our Berlin event that there are a lot of talented and ambitious young ppl in Europe. Just (almost) no inspiring companies building the future, nor VCs that have the balls to give them a chance. No wonder everybody wants to come to SF.
Berlin I’m in you
📢 First release: 38 monolingual reference LLMs (2.15B params) via @hplt_eu + #OpenEuroLLM ⚙️ Trained on 100B tokens from the HPLT v2 dataset 🌍 Cover EU langs + others ⚙️ Based on LLaMA, trained on #LUMI 📈 Useful for evaluation Downloads + more info at openeurollm.eu/blog/hplt-oell…
The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing: - Natively multimodal…
I’m so excited to announce Gemma 3n is here! 🎉 🔊Multimodal (text/audio/image/video) understanding 🤯Runs with as little as 2GB of RAM 🏆First model under 10B with @lmarena_ai score of 1300+ Available now on @huggingface, @kaggle, llama.cpp, ai.dev, and more
recorded my first @LeRobotHF dataset today 🦾🤖 I finally got my hands on it @ErikKaum 😊
v1.1 of RLHF book just pushed with the promised expansion on RLVR / reasoning models. * summary of the major reasoning model reports so far, * common practices/tricks and who used them, * related reasoning work before o1, * async RL, * other improvements. rlhfbook dot com
for those that are too lazy to read papers: Magistral is an RL-only model (think R1-Zero), no distillation from open-source traces of larger reasoning models. the Magistral paper reports results when SFT is done before RL. finally, this is a preview :)
Replay buffers are coming for your LM reinforcement learning recipes. Async is the way. Some early work on it! Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay Sun et al.
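The rollout-replay idea above can be sketched as a small buffer that stores recent rollouts and resamples them, so each expensive generation can contribute to more than one gradient update. This is a minimal illustrative sketch, not the paper's implementation; the class and method names are hypothetical.

```python
import random
from collections import deque

class RolloutReplayBuffer:
    """Store recent (prompt, completion, reward) rollouts and resample
    them, so async training can reuse rollouts across updates."""

    def __init__(self, capacity: int = 1024, seed: int = 0):
        self.buffer = deque(maxlen=capacity)  # oldest rollouts evicted first
        self.rng = random.Random(seed)

    def add(self, prompt: str, completion: str, reward: float) -> None:
        self.buffer.append((prompt, completion, reward))

    def sample(self, batch_size: int):
        """Uniformly resample up to batch_size stored rollouts."""
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)
```

A real async recipe would also track rollout staleness (which policy version generated each sample) and weight or filter by difficulty, as the paper's title suggests.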
Our lunch breaks? More like AI power sessions: We skip the fluff & share real-world AI tips + highlights. Our founder just shared practical tips on mastering Claude Code, our employee of the month. Hearing practical use cases - like using Claude Code's "rewind" to perfect prompts…
Major reasoning models so far with technical reports (focused on those w/ RL):
2025-01-22 — DeepSeek R1 — arxiv.org/abs/2501.12948
2025-01-22 — Kimi 1.5 — arxiv.org/abs/2501.12599
2025-03-31 — Open-Reasoner-Zero — arxiv.org/abs/2503.24290
2025-04-10 — Seed 1.5-Thinking —…
Another generative / inference-time scaling reward modeling paper. It's the direction things are going.
Super excited to share 💪🧠Reasoning Gym! 🧵 We provide over 100 data generators and verifiers spanning several domains (algebra, arithmetic, code, geometry, logic, games) for training the next generation of reasoning models. In essence, we can generate an infinite amount of…
Introducing Reasoning Gym: Over 100 procedurally generated reasoning environments for evaluation and RLVR of language models. Generate virtually infinite training or evaluation data with fine-grained difficulty control and automatic verifiers. 🧵 1/
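The generator/verifier pattern described in the two posts above can be sketched as a pair of functions: a procedural generator with a difficulty knob, and an automatic verifier. This is an illustrative sketch in the spirit of Reasoning Gym, not the library's actual API; the names and the `n_digits` difficulty control are hypothetical.

```python
import random

def generate_addition_task(n_digits: int, seed: int) -> dict:
    """Procedurally generate an addition problem; n_digits controls
    difficulty, and a fixed seed makes generation reproducible."""
    rng = random.Random(seed)
    lo, hi = 10 ** (n_digits - 1), 10 ** n_digits - 1
    a, b = rng.randint(lo, hi), rng.randint(lo, hi)
    return {"question": f"What is {a} + {b}?", "answer": str(a + b)}

def verify(task: dict, model_output: str) -> bool:
    """Automatic verifier: check the final token of the model output
    against the generated reference answer."""
    tokens = model_output.strip().split()
    return bool(tokens) and tokens[-1] == task["answer"]
```

Because tasks are generated rather than scraped, training data is effectively infinite and cannot be memorized from a fixed test set, which is the core appeal for RLVR.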
We've been thinking about what the "ideal" architecture should look like in the era where inference is driving AI progress. GTA & GLA are steps in this direction: attention variants tailored for inference: high arithmetic intensity (make GPUs go brr even during decoding), easy to…
"Pre-training was hard, inference easy; now everything is hard." – Jensen Huang. Inference drives AI progress b/c of test-time compute. Introducing inference-aware attn: parallel-friendly, high arithmetic intensity – Grouped-Tied Attn & Grouped Latent Attn
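The "grouped" part of these variants builds on the idea of sharing KV heads across groups of query heads (as in grouped-query attention), which shrinks the KV cache and raises arithmetic intensity during decoding. Below is a minimal NumPy sketch of that shared-KV grouping only; it is not the paper's GTA/GLA implementation, and the function name is illustrative.

```python
import numpy as np

def grouped_attention(q, k, v, n_q_heads: int, n_kv_heads: int):
    """q: (n_q_heads, T, d); k, v: (n_kv_heads, T, d).
    Each group of n_q_heads // n_kv_heads query heads reads the same
    KV head, so the KV cache shrinks by that factor."""
    group = n_q_heads // n_kv_heads
    d = q.shape[-1]
    outs = []
    for h in range(n_q_heads):
        kh, vh = k[h // group], v[h // group]  # shared KV head for this group
        scores = q[h] @ kh.T / np.sqrt(d)
        # numerically stable softmax over keys
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        outs.append(w @ vh)
    return np.stack(outs)  # (n_q_heads, T, d)
```

During decoding the bottleneck is reading the KV cache from memory, not FLOPs, so reusing each loaded KV head across several query heads does more math per byte loaded ("make GPUs go brr even during decoding").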
Introducing LisanBench LisanBench is a simple, scalable, and precise benchmark designed to evaluate large language models on knowledge, forward-planning, constraint adherence, memory and attention, and long context reasoning and "stamina". "I see possible futures, all at once.…