Scott Condron
@_ScottCondron
Helping build AI/ML dev tools at @weights_biases. I post about machine learning, data visualisation, software tools.
Here's an animation of a @PyTorch DataLoader. It turns your dataset into an iterator of shuffled, batched tensors. (This is my first animation using @manim_community, the community fork of @3blue1brown's manim.) Here's a little summary of the different parts for those curious: 1/5
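For anyone who wants to poke at the pieces the animation shows, here's a minimal sketch with toy tensors (not the animation's actual data):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# A toy dataset: 10 samples with 3 features each, plus integer labels.
xs = torch.randn(10, 3)
ys = torch.arange(10)
dataset = TensorDataset(xs, ys)

# The DataLoader wraps the dataset in an iterator that shuffles sample
# indices each epoch, groups them into batches, and collates each group
# into stacked tensors.
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for batch_xs, batch_ys in loader:
    print(batch_xs.shape, batch_ys)  # e.g. torch.Size([4, 3]) tensor([7, 2, 9, 0])
```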
🚨 @Alibaba_Qwen’s Qwen3-Thinking is now live in W&B Inference (FP16, 10¢ in/out)! Open-weight SOTA that runs neck-and-neck with o3/Gemini 2.5 Pro at 40x lower cost? Crazy work. Here's the tale of the tape for this model and how you can get $20 in credits. ↓
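If you'd rather try it from code, here's a minimal sketch using the OpenAI-compatible client. The base URL, the `project` argument, and the model ID are assumptions to check against the W&B Inference docs, not details from the tweet:

```python
import openai

# Assumed setup: W&B Inference exposes an OpenAI-compatible endpoint.
# Replace the API key and <team>/<project> with your own values.
client = openai.OpenAI(
    base_url="https://api.inference.wandb.ai/v1",  # assumed endpoint
    api_key="<your-wandb-api-key>",
    project="<team>/<project>",  # W&B project to attribute usage to
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B-Thinking-2507",  # assumed model ID
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.content)
```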
Just tried opencode for the first time... I am impressed... use this config for W&B Inference with Qwen Coder 480B. It just... it just worked, wtf. On first try. Exact config I used for opencode.json (make that anywhere and replace my lobotollama with your project name): {…
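The tweet's config is cut off, so for reference here's a hedged sketch of what such an opencode.json can look like, based on opencode's documented custom-provider format. The base URL, env-var syntax, and model ID are all assumptions to verify against the opencode and W&B docs; this is not the author's exact config:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "wandb": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "W&B Inference",
      "options": {
        "baseURL": "https://api.inference.wandb.ai/v1",
        "apiKey": "{env:WANDB_API_KEY}"
      },
      "models": {
        "Qwen/Qwen3-Coder-480B-A35B-Instruct": {
          "name": "Qwen3 Coder 480B"
        }
      }
    }
  }
}
```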
claude code shipped subagents today, so i guess we gotta too ...and it's done - available in opencode 0.3.65. i made a subagent to teach me a lesson if i get too cocky - you all know i need it
TL;DR: Open-source AI just closed the gap (at least on benchmark scores). Qwen3-Thinking (235B) is now shoulder to shoulder with the frontier giants. AI just changed forever. No strings: Apache 2.0. Download it now.
🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding…
Wow, the new Qwen reasoner at only 235B params is as good as the top closed frontier-lab models. Big day for open source
They were missing, so I added @AnthropicAI Opus 4 Thinking and @OpenAI o3 benchmark results to the comparison chart 🆚🔎 Vibe check pending, but on benchmarks it seems that we've got an open model competitive with Opus 4 / o3 / Gemini 2.5 🤯
last two runs of the biggest-scale project I've ever done 🥲 training 1.5B, 3B, 7B, 14B, 32B models: pretraining + rejection sampling to build a dataset + supervised finetuning + reinforcement learning. now time to write
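For the "rejection sampling to build a dataset" step, the usual recipe looks roughly like the sketch below: sample k completions per prompt, keep only the ones a verifier accepts, and use the survivors as SFT data. This is a toy illustration, not the author's pipeline; `generate` and `is_correct` are hypothetical stand-ins for a model sampler and a verifier:

```python
import random

def generate(prompt: str) -> str:
    # Hypothetical stand-in for sampling a completion from a model:
    # here it "answers" an addition prompt, sometimes incorrectly.
    a, b = map(int, prompt.split("+"))
    return str(a + b + random.choice([0, 0, 0, 1]))  # noisy model

def is_correct(prompt: str, completion: str) -> bool:
    # Hypothetical verifier: recompute the ground truth and compare.
    a, b = map(int, prompt.split("+"))
    return completion == str(a + b)

def rejection_sample(prompts, k=8):
    """Sample k completions per prompt; keep only verifier-approved ones."""
    kept = []
    for prompt in prompts:
        for _ in range(k):
            completion = generate(prompt)
            if is_correct(prompt, completion):
                kept.append({"prompt": prompt, "completion": completion})
    return kept

sft_data = rejection_sample(["2+2", "13+8"], k=4)
print(len(sft_data), "accepted samples")
```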
"i can fix her"
here's qwen7b instruct falling off a cliff
Implemented IPPO from scratch in PyTorch. IPPO, or Independent PPO, is a MARL concept where each agent is independent and has its own critic. Like PPO, but repeated n times. It's considered the baseline that many MARL algorithms compare against, like MAPPO etc. (my next post) This…
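A minimal sketch of the core idea (toy networks and random rollout data, not the author's implementation): n fully independent actor-critics, each updated with a standard clipped PPO loss on its own experience.

```python
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS, N_AGENTS, CLIP_EPS = 4, 3, 2, 0.2

class ActorCritic(nn.Module):
    """One agent's own policy and value function (no parameter sharing)."""
    def __init__(self):
        super().__init__()
        self.actor = nn.Sequential(nn.Linear(OBS_DIM, 32), nn.Tanh(), nn.Linear(32, N_ACTIONS))
        self.critic = nn.Sequential(nn.Linear(OBS_DIM, 32), nn.Tanh(), nn.Linear(32, 1))

# IPPO: n fully independent PPO learners, one per agent.
agents = [ActorCritic() for _ in range(N_AGENTS)]
optims = [torch.optim.Adam(a.parameters(), lr=3e-4) for a in agents]

def ppo_update(agent, optim, obs, actions, old_logp, advantages, returns):
    """Standard clipped PPO loss, computed from this agent's data only."""
    dist = torch.distributions.Categorical(logits=agent.actor(obs))
    ratio = torch.exp(dist.log_prob(actions) - old_logp)
    clipped = torch.clamp(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS)
    policy_loss = -torch.min(ratio * advantages, clipped * advantages).mean()
    value_loss = (agent.critic(obs).squeeze(-1) - returns).pow(2).mean()
    loss = policy_loss + 0.5 * value_loss
    optim.zero_grad()
    loss.backward()
    optim.step()

# Toy per-agent rollout data; in MARL each agent sees its own observations.
for agent, optim in zip(agents, optims):
    obs = torch.randn(64, OBS_DIM)
    actions = torch.randint(0, N_ACTIONS, (64,))
    old_logp = torch.randn(64).clamp(-3, 0)
    advantages = torch.randn(64)
    returns = torch.randn(64)
    ppo_update(agent, optim, obs, actions, old_logp, advantages, returns)
```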
300+ engineers came to SF to test out the latest advancements in AI and multi-agent frameworks The grand prize? A robot dog. Here are the demos from the @weights_biases WeaveHacks AI hackathon (🧵):
Reinforcement learning is powerful, but not always practical. That’s why the new open-source framework ART caught our eye. It makes RL usable for LLM agents, and in this walkthrough, you’ll see how it trains a small open model to beat GPT-4o-mini at Tic-Tac-Toe.
I'm going around telling anyone who will listen about how @Kimi_Moonshot Kimi K2 was trained
We were first to market with NVIDIA H100, GB200, and GB300. Today we are also the first cloud provider to release Qwen3 Coder. AND at the most competitive pricing. 🔥🔥
🚨 New @Alibaba_Qwen model drop: Qwen3-Coder & Qwen3-2507 are now both live in W&B Inference! (unbeatable prices) @JustinLin610 and team really cooked. 🔥 Here's the scoop on these new open SOTA models and how you can get $20 of W&B credits to try them out yourself. 👇
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…
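The "480B total, 35B active" part is standard Mixture-of-Experts routing: each token only runs through the few experts a router picks for it, so only a fraction of the parameters are used per forward pass. A toy sketch of top-k routing (illustrative sizes, not Qwen's actual architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy mixture-of-experts layer: many experts, only k active per token."""
    def __init__(self, dim=16, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.k = k

    def forward(self, x):  # x: (tokens, dim)
        weights = F.softmax(self.router(x), dim=-1)  # routing probabilities
        topw, topi = weights.topk(self.k, dim=-1)    # keep the k best experts per token
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, slot] == e            # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += topw[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TopKMoE()
y = moe(torch.randn(4, 16))  # each token touches only 2 of 8 experts' weights
```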
Here it is 🔥 The glorious new open SOTA coding model (and its non-reasoning brother) humming on beautiful H200s provided by CoreWeave, and it costs pennies! We really went all out for this one, day 0 inference support, huge huge effort from the @CoreWeave and @weights_biases…
This might be the best AI email I've received. Called me Roger and said [similar company] 😚👌
![Screenshot of the AI email that called him Roger](https://pbs.twimg.com/media/GwidkkWWQAAu2_9.png)
This is what is not small! The boys spent so much time building Qwen3-Coder after Qwen2.5-Coder. It is much bigger, but based on MoE, and way stronger and smarter than before! Not sure we can say it's competitive with Claude Sonnet 4, but it might for sure be a really good coding agent.…