Liliang Ren
@liliang_ren
Senior Researcher at Microsoft GenAI | UIUC CS PhD graduate | Efficient LLM | NLP | Former Intern @MSFTResearch @Azure @AmazonScience
We’re open-sourcing the pre-training code for Phi4-mini-Flash, our SoTA hybrid model that delivers 10× faster reasoning than Transformers — along with μP++, a suite of simple yet powerful scaling laws for stable large-scale training. 🔗 github.com/microsoft/Arch… (1/4)
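For context, μP-style parameterizations keep training stable across model widths by rescaling per-layer learning rates relative to a small tuned base model, so hyperparameters transfer from cheap proxy runs to large ones. Below is a minimal, hedged sketch of that idea in PyTorch; the widths, base learning rate, and the 1/width rule illustrate standard μP, not the actual μP++ recipe in the repo.

```python
# Minimal sketch of muP-style width scaling (illustrative only; not the
# muP++ recipe from the Phi4-mini-Flash repo). Under standard muP with Adam,
# hidden-layer learning rates shrink roughly as 1/width relative to a base
# model where the hyperparameters were tuned.
import torch
import torch.nn as nn

BASE_WIDTH = 256       # width where hyperparameters were tuned (assumed)
TARGET_WIDTH = 4096    # width of the large training run (assumed)
BASE_LR = 3e-3         # learning rate tuned at BASE_WIDTH (assumed)

mult = BASE_WIDTH / TARGET_WIDTH  # muP width multiplier (< 1 when scaling up)

model = nn.Sequential(
    nn.Linear(512, TARGET_WIDTH),           # input projection
    nn.GELU(),
    nn.Linear(TARGET_WIDTH, TARGET_WIDTH),  # hidden layer: lr scales ~1/width
    nn.GELU(),
    nn.Linear(TARGET_WIDTH, 512),           # output projection
)

# Per-parameter-group learning rates: square "hidden" matrices get the
# 1/width-scaled lr. Full muP also prescribes separate rules for embedding
# and readout layers, which this sketch elides.
param_groups = []
for module in model:
    if isinstance(module, nn.Linear):
        is_hidden = module.in_features == module.out_features
        lr = BASE_LR * mult if is_hidden else BASE_LR
        param_groups.append({"params": module.parameters(), "lr": lr})

optimizer = torch.optim.AdamW(param_groups)
```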
See our work in the workshop today. If you are looking for opportunities to work on efficient model architectures, or anything that makes training and inference much faster on thousands of GPUs or more, please come talk to us or DM me. We are hiring.
We are hiring! If you are interested in efficient architectures or in making training and inference on thousands of GPUs much faster, please feel free to DM me or @WeizhuChen! We are doing RL at very large scale!
Just arrived at ICML. Please drop me a message if you are here and would like to chat. We are hiring.
🎉 Excited to share that our paper "Pretrained Hybrids with MAD Skills" was accepted to @COLM_conf 2025! We introduce Manticore - a framework for automatically creating hybrid LMs from pretrained models without training from scratch. 🧵[1/n]
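As a rough illustration of the hybrid idea (not Manticore's actual code): two pretrained blocks, say an attention block and an SSM block, can be combined with a learned gate instead of training a new architecture from scratch. Everything below is a hypothetical sketch; projectors that align mismatched hidden sizes are elided.

```python
# Hedged sketch of a gated hybrid block built from two pretrained modules.
# This illustrates the concept of hybridizing pretrained LMs, not the
# actual Manticore implementation.
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    def __init__(self, block_a: nn.Module, block_b: nn.Module):
        super().__init__()
        self.block_a = block_a                     # e.g. a pretrained attention block
        self.block_b = block_b                     # e.g. a pretrained SSM block
        self.alpha = nn.Parameter(torch.zeros(1))  # learned mixing logit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.sigmoid(self.alpha)  # convex combination of the two paths
        return w * self.block_a(x) + (1 - w) * self.block_b(x)
```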
Microsoft just dropped Phi-4-mini-flash-reasoning.
- built on a new hybrid architecture
- 10X higher throughput and a 2 to 3X reduction in latency
- significantly faster inference without sacrificing reasoning performance
Microsoft swaps most of that heavy work for a lean…
Meet Phi-4-mini-flash-reasoning: a fast, low-latency SLM built for scale with its novel SambaY architecture. Available on Azure AI Foundry and Hugging Face. Experience advanced reasoning capabilities here: msft.it/6018SAmHn
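A quick-start sketch for trying the model from Hugging Face, assuming the standard transformers text-generation API; the repo id and generation settings below are assumptions, so check the model card for the exact values.

```python
# Quick-start sketch (repo id assumed to be
# "microsoft/Phi-4-mini-flash-reasoning"; verify against the model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-4-mini-flash-reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the checkpoint's native precision
    device_map="auto",       # place weights on available GPUs
    trust_remote_code=True,  # hybrid architectures often ship custom code
)

messages = [{"role": "user", "content": "Solve: what is 12 * 17 - 5?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```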
Excited to see the next-gen tokenizer-free model that can filter out redundancy in sequences efficiently (?)
I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit) In a few days, we'll release what I believe is the next major advance for architectures.
Research with amazing collaborators @JizeJiang, @MeitangLi, and @JingchengYang, guided by great advisors and supported by the generous help of talented researchers @BowenJin13, @XingyuFu2, and many open-source contributors (easyr1, verl, vllm, etc.).
Excited to introduce VTool-R1! We’ve trained VLMs to “think visually” using RL, blending Python-based 🖼️visual edits with💡textual Chain-of-Thought reasoning. Our trained qwen2.5-VL-32B surpasses GPT-4o on ChartQA & TableVQA, and even the compact qwen2.5-VL-7B significantly…
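Sketching the "think visually" loop described above: the VLM interleaves textual chain-of-thought with Python-based visual edits whose outputs are appended as new visual context for the next step. The tool names and the toy CROP(...)/ANSWER: protocol below are hypothetical, not the actual VTool-R1 API.

```python
# Illustrative visual-edit reasoning loop; all names are hypothetical.
import ast
from PIL import Image

def crop_region(image: Image.Image, box: tuple) -> Image.Image:
    """A 'visual edit' tool: zoom into a chart or table region of interest."""
    return image.crop(box)

def reasoning_loop(vlm, image: Image.Image, question: str, max_steps: int = 4):
    """`vlm(images, prompt) -> str` is a placeholder for the policy model."""
    context = [image]  # visual context grows as edits are made
    prompt = question
    for _ in range(max_steps):
        step = vlm(context, prompt)  # textual CoT + optional tool call
        if "CROP(" in step:          # model requested a visual edit
            args = step.split("CROP(")[1].split(")")[0]
            box = ast.literal_eval("(" + args + ")")  # e.g. (x0, y0, x1, y1)
            context.append(crop_region(image, box))
        elif "ANSWER:" in step:      # model committed to a final answer
            return step.split("ANSWER:", 1)[1].strip()
        prompt += "\n" + step        # accumulate the chain of thought
    return None  # no answer within the step budget
```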