Alan Dao
@alandao_ai
AI Researcher at Menlo Research. Author of Lucy, Jan-nano, Ichigo, AlphaMaze, and various other works.
Stanford Graduate School of Business students say the MBA they paid about $250,000 for leans on decade‑old slides, and their classes feel locked in 2010. The grade often hinges on writing a clever prompt or paying for a premium AI tool, not on grasping the math behind decisions.…
Something new is behind this release
🚀 GSPO: Group Sequence Policy Optimization — a breakthrough RL algorithm for scaling LMs! 🔹 Sequence-level optimization — theoretically sound & matching reward 🔹 Rock-solid stability for large MoE models — no collapse 🔹 No hacks like Routing Replay — simpler, cleaner…
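The core idea, as I read the paper: compute the importance ratio over the whole sequence (length-normalized) instead of per token, then clip it PPO-style. A minimal PyTorch sketch, assuming group-normalized rewards; the tensor shapes and clipping epsilon here are illustrative, not the Qwen training code:

```python
import torch

def gspo_loss(logp_new, logp_old, rewards, mask, eps=0.2):
    """Sketch of GSPO's sequence-level clipped objective (illustrative).

    logp_new, logp_old: (B, T) per-token log-probs under current / old policy
    rewards: (B,) scalar rewards, one per sampled sequence in the group
    mask: (B, T) 1.0 for response tokens, 0.0 for padding
    """
    lengths = mask.sum(dim=-1)                                # |y_i|
    # Length-normalized sequence-level importance ratio:
    # s_i = (pi_new(y|x) / pi_old(y|x)) ** (1 / |y_i|)
    log_ratio = ((logp_new - logp_old) * mask).sum(dim=-1) / lengths
    s = torch.exp(log_ratio)
    # Group-relative advantages (GRPO-style normalization)
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # PPO-style clipping, but on the whole sequence, not per token
    unclipped = s * adv
    clipped = torch.clamp(s, 1 - eps, 1 + eps) * adv
    return -torch.min(unclipped, clipped).mean()
```

Because the ratio is computed once per sequence and length-normalized, its variance doesn't explode on long responses, which is what the paper credits for stable MoE training without workarounds like Routing Replay.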
For those struggling with deep learning math, I'm resurfacing this tutorial I made last year. In 27 min I show my process by working through the QHAdam optimizer, which is an alphabet soup of symbols.
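For reference, the update rule the tutorial works through: QHAdam (Ma & Yarats, 2019) interpolates between plain SGD and Adam with two extra knobs, ν1 and ν2. A bare NumPy sketch of a single step, with the paper's recommended defaults:

```python
import numpy as np

def qhadam_step(theta, grad, m, v, t, lr=1e-3,
                beta1=0.9, beta2=0.999, nu1=0.7, nu2=1.0, eps=1e-8):
    """One QHAdam update (Ma & Yarats, 2019) written out in NumPy.

    t is the 1-based step count. nu1/nu2 mix the raw (squared) gradient
    with its exponential moving average in the numerator / denominator.
    """
    m = beta1 * m + (1 - beta1) * grad          # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment EMA
    m_hat = m / (1 - beta1 ** t)                # bias corrections
    v_hat = v / (1 - beta2 ** t)
    num = (1 - nu1) * grad + nu1 * m_hat
    den = np.sqrt((1 - nu2) * grad ** 2 + nu2 * v_hat) + eps
    theta = theta - lr * num / den
    return theta, m, v
```

Set nu1 = nu2 = 1.0 and the step reduces exactly to Adam; that's the quickest way to sanity-check the formula.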
I believe next month will bring even more
What an intense month it has been. Free models we received in the last few weeks:
- Kimi K2 1T A32B
- Qwen3 235B A22B non-thinking
- Qwen3 Coder 480B A35B
- new Magistral 24B from Mistral
- Qwen3 235B A22B reasoning
- Step3 321B A38B with new MFA attention
- SmolLM3 3B
- Intern…
It's been a while since we heard from InternLM. A cool new project!
🚀Introducing Intern-S1, our most advanced open-source multimodal reasoning model yet! 🥳Strong general-task capabilities + SOTA performance on scientific tasks, rivaling leading closed-source commercial models. 🥰Built upon a 235B MoE language model and a 6B Vision encoder.…
The latest MLX has a CUDA back-end! To get started:
pip install "mlx[cuda]"
With the same codebase you can develop locally, run your model on Apple silicon, or in the cloud on Nvidia GPUs. MLX is designed around Apple silicon - which has a unified memory architecture. It uses…
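Quick sanity check that the same few lines run unchanged on either backend; this is plain MLX usage, nothing CUDA-specific in the code itself:

```python
import mlx.core as mx

# Identical script on Apple silicon or, with mlx[cuda], on an Nvidia GPU.
a = mx.random.normal((1024, 1024))
b = mx.random.normal((1024, 1024))
c = (a @ b).sum()
mx.eval(c)   # MLX is lazy; eval() forces the computation to run
print(c)
```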
TL;DR: Open-source AI just closed the gap (at least on benchmark scores). Qwen3-Thinking (235B) is now shoulder to shoulder with the frontier giants. AI just changed forever. No strings attached: Apache 2.0. Download it now.
🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding…
The wait is over! Meet Step 3 — the groundbreaking multimodal LLM from StepFun! 🚀 MoE architecture (321B total params, 38B active) 💡 Rivals OpenAI o3, Gemini 2.5 Pro, and Claude Opus 4 in performance 🖥️ Optimized for China’s domestic AI chips StepFun just announced: Step 3…
Another massive open-source LLM is coming soon… and it's from a Chinese company too.
The spotlight is now on Qwen3-235B-A22B-Thinking-2507: our most powerful thinking model to date 🎉 This update brings comprehensive improvements in reasoning and general performance, and it's the culmination of our efforts in scaling RL. Enjoy! 🍻
OpenAI may declare it has achieved AGI this year in order to exit the Microsoft contract.
In 2025 you can get all the education you need. Just use an open-weight model so you don't end up paying for a $200-per-month subscription.
AI is the best teacher. You can ask the same thing in 10 different ways, and it’ll keep explaining until the concept is clear. That’s the most valuable aspect of AI for me.
Check out this post-train if you like small models :) x.com/casper_hansen_…
Recipe to post-train Qwen3 1.7B into a DeepResearch model
What does it mean for something small to think deeply? Meet Lucy, a Qwen3-1.7B post-trained into a DeepResearch model using @willccbb's verifiers.
Primary rule-based rewards:
- Answer correctness: we check whether the…
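To make "rule-based reward" concrete, here is a hypothetical sketch of what an answer-correctness check can look like. The <answer> tag format and the function signature are assumptions for illustration, not the actual Lucy / verifiers implementation:

```python
import re

def answer_correctness_reward(completion: str, gold: str) -> float:
    """Hypothetical rule-based reward: 1.0 if the model's tagged final
    answer matches the gold answer after normalization, else 0.0.
    (Illustration only -- not the real Lucy / verifiers code.)
    """
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if not match:
        return 0.0
    pred = match.group(1).strip().lower()
    return 1.0 if pred == gold.strip().lower() else 0.0
```

The appeal of rewards like this is that they're cheap and unhackable relative to a learned reward model: the policy either produced the right answer in the right format or it didn't.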
This is sick
🎉 Big news! Google Colab now comes with Gradio pre-installed (v5.38)! No more pip install gradio needed - just import and start building AI apps instantly. Thanks to @GoogleColab team and @thechrisperry for making Gradio more accessible to millions of developers worldwide! 🙏
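So a hello-world app in a fresh Colab cell is now literally just this (standard Gradio API, nothing Colab-specific):

```python
import gradio as gr

def greet(name: str) -> str:
    return f"Hello, {name}!"

# In a fresh Colab notebook this now runs with zero installs
demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch()
```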