Tiezhen WANG

@Xianbao_QIAN

Engineer at HuggingFace, ex-Googler on TFLite / micro. Ideas are my own.

Joined November 2022

2K Following

7K Followers

Pinned

Tiezhen WANG@Xianbao_QIAN · Jan 27

My 2025 Predictions for the OS/AI Frontier 1. Core Trend: Pure RL-Driven Superhuman Reasoning Pure reinforcement learning will unlock human-bias-free, nimble models capable of self-evolved reasoning with rule-based rewards, surpassing human performance in specialized domains…

Qwen@Alibaba_Qwen · Jan 26

We're leveling up the game with our latest open-source models, Qwen2.5-1M ! 💥 Now supporting a 1 MILLION TOKEN CONTEXT LENGTH 🔥 Here's what’s new: 1️⃣ Open Models: Meet Qwen2.5-7B-Instruct-1M & Qwen2.5-14B-Instruct-1M —our first-ever models handling 1M-token contexts! 🤯 2️⃣…

22.0K

Tiezhen WANG@Xianbao_QIAN · 7 h

Intern-S1, a new multimodal model from @intern_lm - 235B MoE + 6B vision encoder - 5T multimodal tokens & 2.5T scientific-domain tokens - great model for AI4S research - support tool calling capabilities Model on @huggingface: huggingface.co/internlm/Inter…

13.0K

Tiezhen WANG Retweeted

Denny Zhou@denny_zhou · Jul 24

Slides for my lecture “LLM Reasoning” at Stanford CS 25: dennyzhou.github.io/LLM-Reasoning-… Key points: 1. Reasoning in LLMs simply means generating a sequence of intermediate tokens before producing the final answer. Whether this resembles human reasoning is irrelevant. The crucial…

341

2.0K

3.0K

175.0K

Tiezhen WANG@Xianbao_QIAN · Jul 23

Seed Prover solved 4 out of 6 IMO questions in 3 days and got Silver. Proof: github.com/ByteDance-Seed… Big congratulations to @huajian_xin ! Now you know what I'm going to kindly ask: Would you consider open sourcing it :D

5.0K

Tiezhen WANG@Xianbao_QIAN · Jul 23

A new Lumina‑Image‑2.0 fine-tune by @NetaArt_AI

427

Tiezhen WANG@Xianbao_QIAN · Jul 22

It turns out that Qwen/Qwen3-235B-A22B-Instruct-2507 was just an appetizer; the new Qwen3 Coder is the main meal! An amazing Apache 2 model that claims to beat Sonnet-4 in many dimensions.

1.0K

Tiezhen WANG@Xianbao_QIAN · Jul 20

Nice article on how to tune the performance of PD disaggregation for very LARGE models: huggingface.co/blog/yiakwy-xp…

Link card: Comprehensive Study of H800x104 DGX SuperPod Disaggregation Strategy in SGLang v0.4.8

772

Tiezhen WANG@Xianbao_QIAN · Jul 17

The true value of open source lies not just in the released foundation model itself but, more importantly, in the fine-tuned models that it enables. Derived from SmolLM3-3B, quantized and adapted to NPU, AXERA-TECH/SmolLM3-3B is a great example of the powerful smollm ecosystem…

450

Tiezhen WANG@Xianbao_QIAN · Jul 17

Let's talk about HopeJR in 2 mins! youtube.com/watch?v=G6tt_8…

Tiezhen WANG@Xianbao_QIAN · Jul 17

Livestream on building your own HopeJR. 111 people online. Almost 4000 likes after 5 hours of live streaming. Amazing! Thanks @bilibili_en for the support. Link below:

633

Tiezhen WANG@Xianbao_QIAN · Jul 17

An interesting benchmark.

700

Tiezhen WANG@Xianbao_QIAN · Jul 17

Here it comes: youtube.com/watch?v=qlKpr0…

Tiezhen WANG@Xianbao_QIAN · Jul 16

Tonight, we’re hosting a live conversation with the authors behind recent most-liked @HuggingPapers on AI-for-Science evaluation and agents. Got a burning question for them? Drop it below and we’ll try to work it into the discussion. - ScienceBoard: Evaluating Multimodal…

477

Tiezhen WANG@Xianbao_QIAN · Jul 16

Tonight, we’re hosting a live conversation with the authors behind recent most-liked @HuggingPapers on AI-for-Science evaluation and agents.

Got a burning question for them? Drop it below and we’ll try to work it into the discussion.

- ScienceBoard: Evaluating Multimodal…

869

Tiezhen WANG@Xianbao_QIAN · Jul 16

Great technical report on Zhihu

Zhihu Frontier@ZhihuFrontier · Jul 16

🧠Zhihu contributor & @Kimi_Moonshot dev Dylan shares his thoughts on building Kimi K2: Why RL? Because compute may be infinite, but data is not. RL improves data efficiency - that's why we invest in scaling test-time compute. Why large models? Why Muon optimizer? → It's all…

432

Tiezhen WANG@Xianbao_QIAN · Jul 16

HopeJR: Hello world! How about the first humanoid building livestream? As we're finishing up assembling the first arm of HopeJR, I'm thinking about holding a half-day livestream tomorrow to show the progress as we build. Tomorrow night, Martino and @seeedstudio will go through the…

1.0K