Qwen
@Alibaba_Qwen
Open foundation models for AGI.
Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…

Qwen/Qwen3-Coder with tool calling is supported in LM Studio 0.3.20, out now. 480B parameters, 35B active. Requires about 250GB to run locally.
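Once the model is loaded, LM Studio exposes an OpenAI-compatible local server, so its tool calling can be exercised with a standard client. A minimal sketch, assuming the default local endpoint (port 1234), an assumed model identifier, and a purely illustrative get_weather tool:

```python
# Minimal sketch: tool calling against LM Studio's OpenAI-compatible local server.
# Assumptions: server on the default port 1234, model id "qwen/qwen3-coder-480b"
# (use whatever identifier LM Studio shows), and a hypothetical get_weather tool.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="qwen/qwen3-coder-480b",  # assumed model id
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
)
# If tool calling works, the model should return a structured get_weather call.
print(resp.choices[0].message.tool_calls)
```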
🚀 Introducing Qwen3-MT – our most powerful translation model yet! Trained on trillions of multilingual tokens, it supports 92+ languages—covering 95%+ of the world’s population. 🌍✨ 🔑 Why Qwen3-MT? ✅ Top-tier translation quality ✅ Customizable: terminology control, domain…
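For anyone who wants to try it programmatically, the sketch below assumes Qwen3-MT is reachable through Alibaba Cloud Model Studio's OpenAI-compatible endpoint with a qwen-mt-turbo-style model id and a translation_options field for the terminology control mentioned above. Every name here (endpoint, model id, option keys) is an assumption; check the Model Studio docs for the exact interface.

```python
# Hedged sketch of a terminology-controlled translation request to Qwen3-MT.
# Assumed: the DashScope OpenAI-compatible endpoint, the "qwen-mt-turbo" model id,
# and a "translation_options" extra body (source_lang / target_lang / terms).
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
    api_key=os.environ["DASHSCOPE_API_KEY"],
)

resp = client.chat.completions.create(
    model="qwen-mt-turbo",  # assumed model id
    messages=[{"role": "user", "content": "通义千问3是阿里云推出的翻译大模型。"}],
    extra_body={
        "translation_options": {  # assumed option keys
            "source_lang": "Chinese",
            "target_lang": "English",
            # Terminology control: pin a preferred rendering for a term.
            "terms": [{"source": "通义千问", "target": "Qwen"}],
        }
    },
)
print(resp.choices[0].message.content)
```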

Qwen3-Coder is now available in Cline 🧵 New 480B parameter model with 35B active parameters.
> 256K context window
> comparable performance on SWE-bench to Claude Sonnet 4
> SoTA among open source models
Qwen has passed Moonshot and xAI in token market share 👀
🟣 New: Qwen3-Coder by @Alibaba_Qwen
- 480B params (35B active)
- Native 256K context length, extrapolates to 1M
- Outperforms Kimi, o3, DeepSeek, and more on SWE-Bench Verified (69.6%) 👀
Now live, starting at $1/M tokens 👇
✅ We’re excited to support @Qwen’s Qwen3-Coder on SGLang! With the tool call parser and expert parallelism enabled, it runs smoothly across flexible configurations. Just give it a try! 🔗 github.com/zhaochenyang20…
✅ Try out @Alibaba_Qwen’s Qwen3-Coder on vLLM nightly with the "qwen3_coder" tool call parser! vLLM also offers expert parallelism, so you can run this model in whatever configuration fits your hardware.
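A minimal end-to-end sketch of what that looks like: serve the model with the parser enabled (server command shown in the comment), then hit the OpenAI-compatible endpoint with a tools list. The run_shell tool and the parallelism sizes are illustrative assumptions, not a recommended configuration:

```python
# Serving side (shell command shown as a comment; flags from vLLM's OpenAI-compatible server):
#   vllm serve Qwen/Qwen3-Coder-480B-A35B-Instruct \
#       --enable-auto-tool-choice --tool-call-parser qwen3_coder \
#       --tensor-parallel-size 8 --enable-expert-parallel
# The parallelism sizes above are illustrative; pick whatever fits your hardware.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",  # hypothetical tool, for illustration only
        "description": "Run a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct",
    messages=[{"role": "user", "content": "List the Python files in the current directory."}],
    tools=tools,
)
# Expect a run_shell tool call, parsed into structured form by the qwen3_coder parser.
print(resp.choices[0].message.tool_calls)
```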