Wanchao Liang
@wanchao_
building @thinkymachines; ex-PyTorch @ Meta. Author of PyTorch DTensor and TorchTitan. Opinions are my own
The @PyTorch team is developing a library for large model training called torchtitan 👀 They have scripts to train Llama-3 from scratch. The library went public today on GitHub, but it is still in a pre-release state & under active development. Check it out → github.com/pytorch/torcht…
I’ll be presenting "TorchTitan: a PyTorch-native platform for training foundation models" tomorrow at the ICML @ESFoMo workshop! Come and say hi!
Looking forward to seeing everyone for ES-FoMo part three tomorrow! We'll be in East Exhibition Hall A (the big one), and we've got an exciting schedule of invited talks, orals, and posters planned for you tomorrow. Let's meet some of our great speakers! 1/
Excited to share that I joined @thinkymachines recently! It’s been an incredible experience so far working alongside many talented folks here. We are building multimodal AI that collaborates with humans, as well as great research infra to accelerate AI and science!
Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're…
This is starting to feel more like a conference, less like a course every day. We're now having the amazing @wanchao_ as a guest speaker talking about TorchTitan and DTensors!
torchft + TorchTitan: 1200+ failures, no checkpoints, model convergence. A Llama 3 model was trained across 300 L40S GPUs with synthetic failures every 15s. No restarts. No rollbacks. Just asynchronous recovery and continued progress. 📘 hubs.la/Q03t1Z0b0 #PyTorch…
PyTorch docs are sometimes lacking; new features especially lack real-life code examples, so you end up reading through implementations or codebases. Now, here is @OpenAI deep research's result on DTensor. It basically read through all the torch docs / GitHub issues and it's really…
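For anyone who wants a starting point, here is a minimal DTensor sketch (my own example, not from the report; assumes a recent PyTorch 2.x where DTensor lives under torch.distributed.tensor, launched with torchrun on 4 GPUs):

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard, Replicate

# 1-D device mesh over 4 GPUs (torchrun --nproc_per_node=4 this_script.py)
mesh = init_device_mesh("cuda", (4,))

big = torch.randn(8192, 8192)                       # full tensor, identical on every rank
sharded = distribute_tensor(big, mesh, [Shard(0)])  # rows split across the mesh
print(sharded.to_local().shape)                     # each rank holds a (2048, 8192) shard

replicated = sharded.redistribute(mesh, [Replicate()])  # all-gather back to the full tensor
```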
For too long, users have lived under the software lottery tyranny of fused attention implementations. No longer. Introducing FlexAttention, a new PyTorch API allowing for many attention variants to enjoy fused kernels in a few lines of PyTorch. pytorch.org/blog/flexatten… 1/10
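The core idea, roughly: you write a small score_mod function in plain PyTorch and FlexAttention fuses it into the attention kernel. A hedged sketch (assumes PyTorch 2.5+ where flex_attention is available; the bias/mask combination is just illustrative):

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

def causal_with_rel_bias(score, b, h, q_idx, kv_idx):
    # add a relative-position bias, then mask out future tokens
    score = score + (q_idx - kv_idx)
    return torch.where(q_idx >= kv_idx, score, -float("inf"))

B, H, S, D = 2, 8, 1024, 64
q, k, v = (torch.randn(B, H, S, D, device="cuda") for _ in range(3))

out = flex_attention(q, k, v, score_mod=causal_with_rel_bias)  # eager fallback works

# compile to actually get the fused kernel
flex_compiled = torch.compile(flex_attention)
out = flex_compiled(q, k, v, score_mod=causal_with_rel_bias)
```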
We have been working on PyTorch native float8 and FSDP2 for distributed training. Check out TorchTitan and TorchAO/float8: dev-discuss.pytorch.org/t/enabling-flo… With Andrew Gu, @wanchao_, @drisspg, @vkuzo, @brian_hirsh
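Roughly how the pieces compose (a hedged sketch; the import paths are my assumption since the torchao float8 and FSDP2 APIs have moved between releases):

```python
import torch
import torch.distributed as dist
from torchao.float8 import convert_to_float8_training   # torchao float8 (path assumed)
from torch.distributed.fsdp import fully_shard           # FSDP2 (path assumed; older releases: torch.distributed._composable.fsdp)

dist.init_process_group("nccl")                           # launch with torchrun --nproc_per_node=<ngpus>
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096, bias=False),
    torch.nn.Linear(4096, 4096, bias=False),
).cuda()

convert_to_float8_training(model)   # swap nn.Linear to float8 compute with dynamic scaling
for layer in model:
    fully_shard(layer)              # FSDP2: shard each layer's params/grads per module
fully_shard(model)                  # root wrap
model = torch.compile(model)        # compile so the float8 scaling ops get fused
```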
Announcing the alpha release of torchtune! torchtune is a PyTorch-native library for fine-tuning LLMs. It combines hackable memory-efficient fine-tuning recipes with integrations into your favorite tools. Get started fine-tuning today! Details: hubs.la/Q02t214F0
🚨New🌟blog✍️ on ⏩ maximizing🌙 FLOPS 🚀 Training large models requires maximizing flops/GPU, especially at scale. Excited to share a few of the cool tricks in the thread 👀. 1/N
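As context for what "flops/GPU" means in practice, a back-of-the-envelope MFU calculation using the common ~6 × params FLOPs-per-token approximation (the numbers below are illustrative, not from the blog):

```python
n_params = 70e9                  # model parameters (illustrative)
tokens_per_sec_per_gpu = 450     # measured training throughput per GPU (illustrative)

achieved_flops = 6 * n_params * tokens_per_sec_per_gpu   # ~1.9e14 FLOP/s per GPU
peak_flops = 989e12              # e.g. H100 BF16 dense peak

mfu = achieved_flops / peak_flops
print(f"MFU ≈ {mfu:.0%}")        # ≈ 19% -> plenty of headroom for the tricks in the thread
```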
This is a good question; it gets to the root of the tradeoff between performance and flexibility. So how do PyTorch folks think about this? Long answer: if we're in a world where a single base model can be fine-tuned over all tasks and we're fairly certain that this base model…
Why use PyTorch/JAX at all? Why don't people just write CUDA programs?
PyTorch 2.0 Q&A: 🗓️ March 1 ⏰ 11am PT ✅ Register: hubs.la/Q01DvW9Q0 Introduction to 2-D Parallelism (FSDP + Tensor Parallel) for training large-scale ViT models, and an introduction to PyTorch DistributedTensor. Join @wanchao_ & Junjie Wang. Host DA: @shshnkp
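For a feel of what 2-D parallelism looks like in code, a rough sketch using today's API names (mesh sizes, the toy module, and the FSDP device_mesh argument are illustrative assumptions; the Q&A itself predates some of these APIs):

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import parallelize_module, ColwiseParallel, RowwiseParallel
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# 2 data-parallel groups x 4 tensor-parallel ranks = 8 GPUs (torchrun --nproc_per_node=8)
mesh_2d = init_device_mesh("cuda", (2, 4), mesh_dim_names=("dp", "tp"))

class MLP(torch.nn.Module):
    def __init__(self, dim=1024):
        super().__init__()
        self.up = torch.nn.Linear(dim, 4 * dim)
        self.down = torch.nn.Linear(4 * dim, dim)
    def forward(self, x):
        return self.down(torch.relu(self.up(x)))

model = MLP().cuda()
# Tensor parallel within the "tp" mesh dimension: shard `up` column-wise, `down` row-wise.
parallelize_module(model, mesh_2d["tp"], {"up": ColwiseParallel(), "down": RowwiseParallel()})
# FSDP across the "dp" mesh dimension, layered on top of the TP-sharded module.
model = FSDP(model, device_mesh=mesh_2d["dp"], use_orig_params=True)
```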
Excited about the future of PyTorch 2.0!
The PyTorch roadmap by @soumithchintala. Key points: more speed with the same flexibility; dynamic shapes & graphs via the TorchDynamo + TorchInductor compiler; a more compact backend; simpler distributed training.
We just introduced PyTorch 2.0 at the #PyTorchConference, introducing torch.compile! Available in the nightlies today, stable release Early March 2023. Read the full post: bit.ly/3VNysOA 🧵below! 1/5
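The headline API is a one-line opt-in; a minimal sketch (toy model of my own, just to show the call):

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(128, 128), torch.nn.GELU())
compiled = torch.compile(model)   # TorchDynamo captures the graph, TorchInductor generates fused kernels

x = torch.randn(32, 128)
print(compiled(x).shape)          # first call compiles; later calls reuse the compiled graph
```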
Excited to see many awesome community members in person at #PyTorchConference tomorrow! Some major announcements are coming too…
Google is done. Compare the quality of these responses (ChatGPT)
Wanchao has posted an RFC for distributed tensors in PyTorch at github.com/pytorch/pytorc… ; if you're interested in tensor parallel distributed training check it out! He'll also be at PyTorch Conference, if you want to chat with him IRL (sign up at pytorchconference22.splashthat.com)