Junyang Lin
@JustinLin610
Building Qwen models @Alibaba_Qwen ❤️ 🍵 ☕️ 🍷 🥃
Qwen2.5-Max is here. It looks good on benchmarks, and I hope you can give it a try and see how you feel about this new model! Qwen Chat: chat.qwenlm.ai (choose Qwen2.5-Max as the model). The API is available through the Alibaba Cloud service. Happy new year!
The release of DeepSeek V3 has drawn the whole AI community's attention to large-scale MoE models. Concurrently, we have been building Qwen2.5-Max, a large MoE LLM pretrained on massive data and post-trained with curated SFT and RLHF recipes. It achieves competitive…
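For anyone wanting to try the API mentioned above, here is a minimal sketch of calling Qwen2.5-Max through an OpenAI-compatible endpoint on Alibaba Cloud. The base URL and model identifier below are assumptions based on Alibaba Cloud Model Studio conventions; check the official docs for the exact values for your region and account.

```python
# Minimal sketch: Qwen2.5-Max via an OpenAI-compatible endpoint (assumed values).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DASHSCOPE_API_KEY",  # issued by Alibaba Cloud Model Studio
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # assumed endpoint; verify in docs
)

response = client.chat.completions.create(
    model="qwen-max-latest",  # assumed model id for Qwen2.5-Max; verify in docs
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Give me a one-line summary of mixture-of-experts models."},
    ],
)
print(response.choices[0].message.content)
```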
Qwen 3 just dropped an open-source agentic coding model! Claims it's comparable to Sonnet-4! Will be on LiveBench and CodeLLM shortly Thanks to Qwen for keeping open source alive 👏👏
Thanks for staying up with us!
We're now serving Qwen3-Coder-480B-A35B & Qwen3-235B-A22B-2507 at Hyperbolic! Qwen3-Coder-480B achieves results comparable to Claude Sonnet 4 on coding benchmarks, truly amazing! @JustinLin610 and @huybery are the 420 gang in China, shipping models until 6 AM China time!…
Qwen3-Coder is now available in Cline 🧵 New 480B parameter model with 35B active parameters.
> 256K context window
> comparable performance on SWE-bench to Claude Sonnet 4
> SoTA among open-source models
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…
Quick adaptation! Thx
✅ We’re excited to support @Qwen’s Qwen3-Coder on SGLang! With tool call parser and expert parallelism enabled, it runs smoothly with flexible configurations. Just give it a try! 🔗 github.com/zhaochenyang20…
💥 BREAKING: @Alibaba_Qwen just dropped the world's leading coding model: a 480B Qwen3 Coder with 35B active parameters and a huge context window! This non-reasoning coder is getting near SOTA on SWE-bench, with 68.7 on BFCL (function calling) and 61.8 on Aider! 🧵
🚀
✅ Try out @Alibaba_Qwen 3 Coder on vLLM nightly with "qwen3_coder" tool call parser! Additionally, vLLM offers expert parallelism so you can run this model in flexible configurations where it fits.
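Once a vLLM (or SGLang) server is up with the tool call parser enabled as described above, function calling works through the standard OpenAI-compatible chat API. Below is a hedged sketch of such a client request; the server URL, the registered model name, and the `run_tests` tool schema are all illustrative assumptions, not part of the release.

```python
# Sketch: function-calling request against a locally served Qwen3-Coder
# (e.g., vLLM's OpenAI-compatible server started with the "qwen3_coder" tool call parser).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed local server

tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",  # hypothetical tool, for illustration only
        "description": "Run the project's unit tests and return the result.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Test file or directory"}},
            "required": ["path"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct",  # whatever name the server registers
    messages=[{"role": "user", "content": "Run the tests under tests/ and summarize any failures."}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```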
Nothing is more frustrating than seeing "private scaffold" on public benchmark results. I love that model providers like Qwen and Mistral are now reporting their results specifically using OpenHands as the scaffold; it feels like we're becoming a standard here. x.com/Alibaba_Qwen/s…
🥳 INT4 model for the updated Qwen3-235B-A22B: huggingface.co/Intel/Qwen3-23… vLLM's MoE path doesn't seem to work well yet, but HF Transformers runs it pretty well.
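A rough sketch of the Transformers route mentioned above is below. The Hugging Face repo id is truncated in the post, so it is left as a placeholder here; depending on the quantization format, an extra backend listed on the model card may also be required, and the generation settings are purely illustrative.

```python
# Sketch: loading the Intel INT4 Qwen3-235B-A22B checkpoint with HF Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Intel/Qwen3-23..."  # repo id is truncated in the post; substitute the actual INT4 repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires accelerate; spreads experts across available GPUs
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Write a haiku about mixture-of-experts."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```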
It's out! And you can already run inference on the HF model page thanks to @hyperbolic_labs! huggingface.co/Qwen/Qwen3-Cod…
As always, you'll see it on HF first! huggingface.co/Qwen
🚀 Meet Qwen3-Coder, our most advanced agentic code model yet! Kicking off with the open-sourced model Qwen3-Coder-480B-A35B-Instruct, a 480B MoE with 35B active parameters for top coding & agentic tasks. Plus, we're open-sourcing Qwen Code, a CLI tool for agentic programming!…
Qwen3-Coder is on another level. I had it build a sim based on some scaffolds we are trying. The model left me a message in the sim we built!!!!!
Try here! Pokémon! modelscope.cn/studios/Qwen/Q… huggingface.co/spaces/Qwen/Qw…
A perfect coding model for MLX on Apple silicon. Qwen delivered again. Runs quite fast on an M3 Ultra. Running the 4-bit quantized model with mlx-lm:
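The post's screenshot isn't reproduced here, so below is a minimal sketch of what running a 4-bit MLX build with mlx-lm typically looks like. The exact repo id for the 4-bit weights is an assumption; use whichever MLX-format checkpoint you actually have.

```python
# Sketch: running a 4-bit MLX conversion of Qwen3-Coder with mlx-lm on Apple silicon.
from mlx_lm import load, generate

# Assumed repo id for a community 4-bit MLX conversion; substitute your own checkpoint.
model, tokenizer = load("mlx-community/Qwen3-Coder-480B-A35B-Instruct-4bit")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a Python function that reverses a linked list."}],
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```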
Qwen3-Coder-480B-A35B-Instruct + @hyperbolic_labs is now available in anycoder for vibe coding
This one is not small! The boys spent so much time building Qwen3-Coder after Qwen2.5-Coder. It is much bigger, but based on MoE, and way stronger and smarter than before! Not sure we can say it's competitive with Claude Sonnet 4, but it is for sure a really good coding agent…
Note that this is a non-thinking model. Thinking model on the way!
Bye Qwen3-235B-A22B, hello Qwen3-235B-A22B-2507! After talking with the community and thinking it through, we decided to stop using hybrid thinking mode. Instead, we’ll train Instruct and Thinking models separately so we can get the best quality possible. Today, we’re releasing…
A small update on Qwen3-235B-A22B, but a big improvement in its quality! We thought about this decision for a long time, but we believe that providing better-quality performance is more important than unification at this moment. We are still continuing our research on hybrid…