Jiatao Gu
@thoma_gu
Assistant Prof @CIS_Penn and ML Researcher at @Apple (MLR) | ex-FAIRer | PhD @HKUniversity | Research on generative AI for multimodal learning. I also speak Japanese.
I will be attending #ICML2025 Tue to Sat in Vancouver. Please come check out our oral presentation and spotlight poster on TARFlow on Thu: icml.cc/virtual/2025/p… Looking forward to chatting with old and new friends about next-gen generative models and world models!!
Thanks @9to5mac for summarizing our research on TARFlow/STARFlow! It is an exciting direction, reviving normalizing flows with modern scalable techniques… and more will come!
Apple Research just unearthed a forgotten AI technique and is using it to generate images 9to5mac.com/2025/06/23/app… by @mvcmendes
I like our Vid2Sim for two main reasons: 1. The inverse physics problem can be efficiently tackled through a generalized feed-forward prediction of physical properties + a lightweight optimization accelerated by the proposed Neural Jacobian. 2. Its handle-based 3D representation…
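The "feed-forward prediction + lightweight optimization" recipe can be illustrated with a toy sketch (this is not Vid2Sim's actual model or its Neural Jacobian; the spring system, function names, and values below are all hypothetical): an initial parameter guess is refined by Gauss-Newton steps that reuse a cheap analytic Jacobian of the forward simulation.

```python
import numpy as np

# Hypothetical toy inverse-physics problem: recover a scalar stiffness k
# from observed displacements. A spring's steady displacement under load F
# is x = F / k, so the Jacobian dx/dk = -F / k**2 is cheap to evaluate.

def simulate(k, forces):
    """Forward model: steady-state displacement per applied force."""
    return forces / k

def jacobian(k, forces):
    """Sensitivity of the displacements with respect to stiffness k."""
    return -forces / k**2

forces = np.array([1.0, 2.0, 3.0])
k_true = 5.0
observed = simulate(k_true, forces)

k = 1.0  # crude initial guess, e.g. from a feed-forward predictor
for _ in range(20):  # Gauss-Newton refinement using the analytic Jacobian
    r = simulate(k, forces) - observed   # residual against observations
    J = jacobian(k, forces)              # precomputed sensitivity
    k -= (J @ r) / (J @ J)               # least-squares parameter step

print(round(k, 3))  # → 5.0 (recovers the true stiffness)
```

The point of the sketch: because the Jacobian is available in closed form, each refinement step is a tiny linear solve rather than a full differentiable-simulation pass.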
Check out 🌟Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry & Physics for Mesh-Free Simulation #CVPR2025, from @LingjieLiu1’s lab at UPenn. Congrats to @MorPhLingXD! Vid2Sim aims to achieve system identification by reconstructing geometry, appearance,…
World Simulator, reimagined — now alive with humans, robots, and their vibrant society unfolding in 3D real-world geospatial scenes across the globe! 🚀 One day soon, humans and robots will co-exist in the same world. To prepare, we must address: 1️⃣ How can robots cooperate or…
Come visit our posters today and chat with us! 🕥 10:30–12:30 – Poster #153 🔹 Ego4o: Egocentric Human Motion Capture & Understanding from Multi-Modal Input 🔗 jianwang-mpi.github.io/ego4o/ 🕓 16:00–18:00 – Poster #37 🔹 Vid2Sim: Generalizable, Video-based Reconstruction of…
Please drop by and check our highlight poster tomorrow at #CVPR2025! ExHall D Poster #60 Sun 15 Jun 10:30 a.m. CDT — 12:30 p.m. CDT Great work by our @Apple intern @QihangZhang0224 and look forward to more exploration on explicit 3D generation! zqh0253.github.io/wvd/
Excited to share our paper "World-consistent Video Diffusion (WVD)" has been accepted at #CVPR2025! arxiv.org/abs/2412.01821 Huge congrats to our amazing intern @QihangZhang0224 and colleagues @zhaisf @itsbautistam @KJHMiao @alexttoshev & Josh Susskind!
Congrats @RickyTQChen on the nice work! This reminds me of our earlier work Levenshtein Transformer (x.com/thoma_gu/statu…) at FAIR! We learned a non-autoregressive insertion-deletion network for machine translation. Good memories from before the LLM era!
Padding in our non-AR sequence models? Yuck. 🙅 👉 Instead of unmasking, our new work *Edit Flows* performs iterative refinement via position-relative inserts and deletes, operations naturally suited for variable-length sequence generation. Easily better than using mask tokens.
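To make the contrast with mask-based generation concrete, here is a minimal sketch of insert/delete edit operations on a token sequence (a toy illustration only, not the Edit Flows model; the edit encoding and example sentence are invented): unlike unmasking a fixed-length canvas, these operations let the sequence grow and shrink.

```python
# Toy edit operations: each edit is (op, index, token), where the index is
# relative to the *current* sequence, applied left to right.

def apply_edits(seq, edits):
    """Apply a list of insert/delete edits to a token sequence."""
    seq = list(seq)
    for op, idx, tok in edits:
        if op == "insert":
            seq.insert(idx, tok)   # place tok before position idx
        elif op == "delete":
            del seq[idx]           # remove token at idx (tok is ignored)
    return seq

draft = ["the", "cat", "cat", "sat"]
edits = [
    ("delete", 1, None),     # drop the duplicated "cat"
    ("insert", 3, "down"),   # grow the sequence: length is not fixed
]
print(apply_edits(draft, edits))  # → ['the', 'cat', 'sat', 'down']
```

With mask tokens the output length must be committed to up front; with inserts and deletes the model can correct both content and length during iterative refinement.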
Apple presents STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
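For readers new to normalizing flows, the core idea STARFlow builds on is the change-of-variables rule: an invertible map gives an exact log-likelihood. Below is a minimal one-layer illustration (not STARFlow itself, which stacks learned transformer-based flows in a latent space; the affine map and values here are invented for illustration).

```python
import numpy as np

# Change-of-variables rule behind normalizing flows:
#   log p(x) = log p_z(f(x)) + log |det df/dx|
# illustrated with a single invertible elementwise affine map.

def affine_flow(x, scale, shift):
    """Invertible map z = scale * x + shift, with its log-det-Jacobian."""
    z = scale * x + shift
    log_det = np.sum(np.log(np.abs(scale)))  # Jacobian is diagonal
    return z, log_det

def log_prob(x, scale, shift):
    """Exact log-likelihood of x under a standard-normal base density."""
    z, log_det = affine_flow(x, scale, shift)
    log_base = -0.5 * np.sum(z**2) - 0.5 * len(z) * np.log(2 * np.pi)
    return log_base + log_det

x = np.array([0.5, -1.0])
print(log_prob(x, scale=np.array([2.0, 0.5]), shift=np.array([0.0, 1.0])))
```

Because the likelihood is exact (no variational bound and no iterative denoising chain), scaling this family up with modern architectures is what makes the "revival" direction attractive.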
Feel free to drop by our talks at: June 11 Morning (202 B): vision-x-nyu.github.io/scalable-visio… June 11 Afternoon (Grand A2): generative-vision.github.io/workshop-CVPR-… June 12 Afternoon (103 A): vgm-cvpr.github.io
I will be attending #CVPR2025 and presenting our latest research from Apple MLR! Specifically, I will present our highlight poster, World-consistent Video Diffusion (cvpr.thecvf.com/virtual/2025/p…), and give three invited workshop talks, which include our recent preprint ★STARFlow★! (0/n)