Shuangfei Zhai
@zhaisf
Machine Learning Research @ Apple
We attempted to make Normalizing Flows work really well, and we are happy to report our findings in our paper arxiv.org/pdf/2412.06329, with code at github.com/apple/ml-tarfl…. [1/n]
Normalizing Flows explained in terms of catching rabbits👇 There is a dense cluster of steep hills (figure 1), and at the top of each hill lives a rabbit. Catching rabbits here is very difficult, because you’d have to climb up and down, and sometimes you may get lost in the valleys.…
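For context (my addition, not part of the thread): the machinery behind the rabbit analogy is the change-of-variables identity. An invertible map f flattens the hilly data density into a simple base density, and the exact log-likelihood of a point is

```latex
\log p_X(x) = \log p_Z\big(f(x)\big) + \log \left| \det \frac{\partial f(x)}{\partial x} \right|
```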



Score matching has become a popular concept, but its original form as proposed in the JMLR 2005 paper (often referred to as implicit SM) is rarely brought up nowadays. I first came across SM during my PhD around 2015 and was immediately drawn to it. I thought that it could be…
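For reference (my addition): the implicit score matching objective from that JMLR 2005 paper replaces the unknown data score with a trace term via integration by parts, where s_\theta(x) = \nabla_x \log p_\theta(x) is the model score:

```latex
J(\theta) = \mathbb{E}_{p_{\mathrm{data}}(x)}\!\left[ \operatorname{tr}\!\big(\nabla_x s_\theta(x)\big) + \tfrac{1}{2}\,\big\lVert s_\theta(x) \big\rVert^2 \right]
```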

On a related note, I also like to think of weight decay (in the AdamW style, not Adam + L2 regularization) as part of the LR schedule. The basic logic: more WD -> smaller weight norm -> larger relative update at a fixed LR, which is effectively an increased LR. And the regularization benefit from…
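Here is a minimal numpy sketch of that argument (my own toy illustration, not code from the tweet): a hand-rolled AdamW on a noisy quadratic, comparing different weight decay values. Because the Adam step is roughly lr-sized regardless of the weight scale, a larger decoupled decay shrinks ||w|| and inflates the relative update ||Δw||/||w||.

```python
# Toy illustration of the tweet's argument (my own sketch): the Adam part of an
# AdamW update is roughly lr-sized regardless of weight scale, so a larger
# decoupled weight decay shrinks ||w|| and makes the *relative* update
# ||dw|| / ||w|| larger at a fixed lr, i.e. effectively an increased learning rate.
import numpy as np

def adamw_relative_update(weight_decay, steps=3000, lr=1e-2, betas=(0.9, 0.999), eps=1e-8):
    rng = np.random.default_rng(0)
    y = np.full(100, 5.0)                          # target with a large norm
    w = rng.normal(size=100)                       # toy parameters
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = (w - y) + 0.1 * rng.normal(size=100)   # noisy gradient of 0.5 * ||w - y||^2
        m = betas[0] * m + (1 - betas[0]) * g
        v = betas[1] * v + (1 - betas[1]) * g**2
        m_hat = m / (1 - betas[0] ** t)
        v_hat = v / (1 - betas[1] ** t)
        update = lr * m_hat / (np.sqrt(v_hat) + eps)
        w = w - update - lr * weight_decay * w     # decoupled weight decay (AdamW)
    return np.linalg.norm(w), np.linalg.norm(update) / np.linalg.norm(w)

for wd in (0.0, 0.5, 2.0):
    w_norm, rel = adamw_relative_update(wd)
    print(f"wd={wd:3.1f}  ||w||={w_norm:7.2f}  relative update={rel:.5f}")
```

With more weight decay the final weight norm drops while the per-step update stays about lr-sized, so the printed relative update grows.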
We all know about learning rate schedules, but how come scheduling the momentum terms (i.e., the betas in Adam) is not a thing? A few years back, I was working on a problem that involved an image reconstruction loss ||f(x) - y||^2. In order to make the output look nice, a typical…
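As a concrete illustration (my own sketch, not the recipe from the tweet), one could schedule beta1 exactly like a learning rate, assuming a PyTorch-style Adam that re-reads its betas from param_groups at every step:

```python
# Illustrative sketch only: anneal Adam's beta1 over training by rewriting
# param_groups before each step, the same way an LR scheduler rewrites lr.
# The linear 0.9 -> 0.5 schedule is an arbitrary choice for demonstration.
import torch

model = torch.nn.Linear(16, 16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

def beta1_schedule(step, total_steps, start=0.9, end=0.5):
    """Linearly anneal beta1 from `start` to `end` over training."""
    frac = min(step / total_steps, 1.0)
    return start + frac * (end - start)

total_steps = 1000
for step in range(total_steps):
    x = torch.randn(32, 16)
    loss = ((model(x) - x) ** 2).mean()      # toy reconstruction loss ||f(x) - y||^2 with y = x
    optimizer.zero_grad()
    loss.backward()
    b1 = beta1_schedule(step, total_steps)
    for group in optimizer.param_groups:     # update betas just like an LR schedule
        group["betas"] = (b1, group["betas"][1])
    optimizer.step()
```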

Congratulations to @DaniloJRezende and Shakir Mohamed! This is how I first learned about normalizing flows. Their paper on VAEs could/should have also gotten a nod.
#ICML2025 test of time award
Normalizing Flows are coming back to life. I'll be attending #ICML2025 on Jul 17 to present TarFlow -- with the Oral presentation at 4:00 PM in West Ballroom A and the poster at 4:30 PM in East Exhibition Hall A-B #E-2911. Stop by and I promise it will be worth your time.
Small plug, not really advertised: we similarly showed how to perform temperature-based control and composition of separately trained diffusion models via SMC and the Feynman-Kac model formalism, with score distillation of the energy, at AISTATS last year - Diversity control…
Why do we keep sampling from the same distribution the model was trained on? We rethink this old paradigm by introducing Feynman-Kac Correctors (FKCs) – a flexible framework for controlling the distribution of samples at inference time in diffusion models! Without re-training…
Feel free to drop by our talks at:
June 11 Morning (202 B): vision-x-nyu.github.io/scalable-visio…
June 11 Afternoon (Grand A2): generative-vision.github.io/workshop-CVPR-…
June 12 Afternoon (103 A): vgm-cvpr.github.io
I will be attending #CVPR2025 and presenting our latest research at Apple MLR! Specifically, I will present our highlight poster on world-consistent video diffusion (cvpr.thecvf.com/virtual/2025/p…), and give three invited workshop talks, which include our recent preprint ★STARFlow★! (0/n)
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis "We present STARFlow, a scalable generative model based on normalizing flows that achieves strong performance on high-resolution image synthesis"
Yet another new paper that tries the exact same Jacobi iteration idea! I guess good ideas are all similar 😅 arxiv.org/abs/2505.24791.
TarFlow has great training efficiency, but its sampling can be slow. This new paper applies (training-free) Jacobi iteration to TarFlow and achieves up to 5x sampling speedup without loss of quality. Very promising results! arxiv.org/abs/2505.12849. Bonus: they also open…
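To sketch the general idea (my own toy illustration under simplified assumptions, not the paper's implementation): inverting a causal autoregressive affine flow is normally sequential over positions, but the inverse is a fixed point that Jacobi iteration can update fully in parallel, and strict causality guarantees exact convergence after at most T iterations.

```python
# Toy sketch (not the paper's code): a causal autoregressive affine flow
# z_t = x_t * exp(a_t(x_<t)) + b_t(x_<t) is parallel in the forward direction
# but naively sequential to invert. Jacobi iteration treats the inverse as a
# fixed point and updates every position at once; because the conditioner is
# strictly causal, the iterates become exact after at most T sweeps, and in
# practice far fewer often suffice, which is where the sampling speedup comes from.
import numpy as np

rng = np.random.default_rng(0)
T = 64
W = np.tril(rng.normal(size=(T, T)), k=-1) / T   # strictly causal toy conditioner weights

def conditioner(x):
    """a_t (log-scale) and b_t (shift) depend only on x_<t."""
    h = W @ x
    return 0.5 * np.tanh(h), h

def forward(x):
    a, b = conditioner(x)
    return x * np.exp(a) + b                     # parallel over positions

def invert_jacobi(z, n_iters):
    x = z.copy()                                 # initial guess
    for _ in range(n_iters):
        a, b = conditioner(x)
        x = (z - b) * np.exp(-a)                 # update all positions in parallel
    return x

x_true = rng.normal(size=T)
z = forward(x_true)
x_rec = invert_jacobi(z, n_iters=T)
print("max reconstruction error:", np.abs(x_rec - x_true).max())
```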
Cool new paper "Normalizing Flows are Capable Models for RL" -- it delivers the same message as our TarFlow paper did, but demonstrates it in the RL domain!
Normalizing Flows (NFs) check all the boxes for RL: exact likelihoods (imitation learning), efficient sampling (real-time control), and variational inference (Q-learning)! Yet they are overlooked in favor of more expensive and less flexible contemporaries like diffusion models. Are NFs…