Gashon Hussein
@GashonHussein
Stanford
Excited to share our new paper, "One-Minute Video Generation with Test-Time Training (TTT)" in collaboration with NVIDIA. We augment a pre-trained Transformer with TTT-layers and finetune it to generate one-minute Tom and Jerry cartoons with strong temporal and spatial…

n-simplex attention makes incredible sense because of its honesty: it literally says you can put more compute into the attention operation to get more gains, a trend we've seen so many times. This differs from a lot of 'suspicious' claims, such as that you can use less compute to perform…
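For intuition, here is a minimal NumPy sketch contrasting standard pairwise attention with a trilinear "2-simplex" variant in which each query scores pairs of key positions. The function names, the elementwise-product combination of the two value vectors, and the shapes are illustrative assumptions, not the formulation from the thread; the point is only that the higher-order score tensor costs O(n^3) where standard attention costs O(n^2).

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_1simplex(Q, K, V):
    # Standard (pairwise) attention: an (n, n) score matrix.
    d = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def attention_2simplex(Q, K1, K2, V1, V2):
    # Trilinear attention: each query i attends to *pairs* (j, k),
    # so the score tensor is (n, n, n) -- strictly more compute.
    n, d = Q.shape
    scores = np.einsum('id,jd,kd->ijk', Q, K1, K2) / np.sqrt(d)
    w = softmax(scores.reshape(n, -1)).reshape(n, n, n)
    # Illustrative choice: a pair's value is the elementwise product
    # of the two value vectors.
    return np.einsum('ijk,jd,kd->id', w, V1, V2)

n, d = 16, 8
rng = np.random.default_rng(0)
Q, K1, K2, V1, V2 = (rng.standard_normal((n, d)) for _ in range(5))
print(attention_1simplex(Q, K1, V1).shape)       # (16, 8)
print(attention_2simplex(Q, K1, K2, V1, V2).shape)  # (16, 8)
```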
Our models need to run in real time on real robots, but inference with big VLAs takes a long time. We developed Real-Time Action Chunking (RTC) to enable real-time inference with flow matching for the π0 and π0.5 VLAs! More in the thread👇
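The tweet doesn't spell out the mechanism, but here is a rough Python sketch of the general idea as I understand it: the next action chunk is sampled by integrating a flow-matching velocity field while the actions that overlap with the chunk already executing are held consistent. The hard clamp of the overlap and the `velocity_model` stand-in are illustrative assumptions, not PI's actual RTC algorithm (which is described in the linked thread).

```python
import numpy as np

def sample_chunk_realtime(velocity_model, prev_chunk, overlap, horizon, dim, steps=10):
    """Flow-matching sampling of an action chunk with a frozen overlap.

    velocity_model(x, t) -> velocity is an assumed stand-in for the VLA's
    action expert. The first `overlap` actions are clamped to the tail of
    the previous chunk, so the robot keeps executing smoothly while the
    new chunk is being denoised.
    """
    x = np.random.standard_normal((horizon, dim))  # start from noise
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        x[:overlap] = prev_chunk[-overlap:]        # pin the committed actions
        x = x + dt * velocity_model(x, t)          # Euler step along the flow
    x[:overlap] = prev_chunk[-overlap:]
    return x

# Toy usage with a dummy velocity field pulling actions toward zero.
dummy = lambda x, t: -x
prev = np.ones((50, 7))
chunk = sample_chunk_realtime(dummy, prev, overlap=10, horizon=50, dim=7)
print(chunk.shape)  # (50, 7)
```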
Fun project at PI: knowledge insulation for VLAs. We figured out how to train VLAs with continuous actions much more effectively by insulating the VLM and training it with discrete actions, while the action expert learns on top. 5-7x faster, and importantly, way better language following…
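A minimal PyTorch sketch of the insulation idea, assuming the mechanism is a stop-gradient: the VLM backbone trains on a discrete (tokenized) action loss, while the continuous action expert reads the backbone's features through `.detach()`, so its loss can never perturb the VLM. The toy modules and the MSE loss (standing in for the expert's actual objective) are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyVLM(nn.Module):
    """Stand-in for the VLM backbone: returns features and token logits."""
    def __init__(self, d=32, vocab=256):
        super().__init__()
        self.body = nn.Linear(d, d)
        self.head = nn.Linear(d, vocab)
    def forward(self, x):
        h = torch.relu(self.body(x))
        return h, self.head(h)

def insulated_loss(vlm, expert, obs, discrete_targets, cont_targets):
    feats, logits = vlm(obs)
    # Discrete (tokenized) action loss: this is what trains the backbone.
    loss_disc = F.cross_entropy(logits.flatten(0, 1), discrete_targets.flatten())
    # Stop-gradient: the continuous expert learns on top of frozen features,
    # so its gradients are insulated from the VLM.
    pred = expert(feats.detach())
    loss_cont = F.mse_loss(pred, cont_targets)
    return loss_disc + loss_cont

vlm, expert = ToyVLM(), nn.Linear(32, 7)   # expert = toy continuous action head
obs = torch.randn(4, 10, 32)
loss = insulated_loss(vlm, expert, obs,
                      torch.randint(0, 256, (4, 10)), torch.randn(4, 10, 7))
loss.backward()
```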
We’re excited to announce Sunflower Capital Funds I and II. Sunflower is a $250m fund that partners at the earliest stage with companies building foundations for modern enterprises, critical industries, and the physical world.
We got a robot to clean up homes that were never seen in its training data! Our new model, π-0.5, aims to tackle open-world generalization. We took our robot into homes that were not in the training data and asked it to clean kitchens and bedrooms. More below⤵️
introducing chipmunk—a training-free algorithm making ai video generation 3.7x & image gen 1.6x faster! ⚡️ our kernels for column-sparse attention are 9.3x faster than FlashAttention-3 and column-sparse GEMM is 2.5x faster vs. cuBLAS a thread on the GPU kernel optimizations 🧵
Our latest joint work w/ SandyResearch @ UCSD: training-free acceleration of Diffusion Transformers w/ dynamic sparsity, led by @austinsilveria @SohamGovande! ⚡️ 3.7x faster video and 1.6x faster image generation while preserving quality! 🧵 Open-source code & CUDA kernels!
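A NumPy sketch of what column-sparse attention looks like at a high level: the softmax and the value GEMM run over only a chosen subset of key/value columns, so both shrink. The selection heuristic below, and the dense score pass used to pick columns, are purely illustrative; the actual kernels choose columns dynamically across diffusion steps and skip the dense work on GPU, which is where the speedup comes from.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def column_sparse_attention(Q, K, V, keep_ratio=0.25):
    """Attend to a subset of key/value columns only.

    Column choice (largest mean |score|) is an illustrative stand-in
    for whatever heuristic the real kernels use.
    """
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                   # dense, for selection only
    k = max(1, int(keep_ratio * n))
    cols = np.argsort(-np.abs(scores).mean(0))[:k]  # keep the k hottest columns
    return softmax(scores[:, cols]) @ V[cols]       # small softmax + small GEMM

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((64, 32)) for _ in range(3))
print(column_sparse_attention(Q, K, V).shape)  # (64, 32)
```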
I built Orchestrator with @jameszhou02, a proof of concept for how we envision the future of software engineering. In the future, every engineer will manage swarms of AI engineers that execute their plans in parallel. Orchestrator takes an input prompt and creates a plan that…
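The tweet is truncated, but the fan-out pattern it describes is easy to sketch. A hypothetical asyncio version, with make_plan and run_agent as stand-ins for the model-backed planner and worker agents (not Orchestrator's actual code):

```python
import asyncio

def make_plan(prompt: str) -> list[str]:
    """Stand-in planner: a real orchestrator would ask a model to decompose."""
    return [f"{prompt} / subtask {i}" for i in range(4)]

async def run_agent(task: str) -> str:
    """Stand-in for dispatching one task to an AI engineer (e.g. an API call)."""
    await asyncio.sleep(0.1)  # simulate model latency
    return f"done: {task}"

async def orchestrate(prompt: str) -> list[str]:
    # Fan the plan out to a swarm of workers and gather results in parallel.
    return await asyncio.gather(*(run_agent(t) for t in make_plan(prompt)))

print(asyncio.run(orchestrate("add dark mode")))
```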
One of the neat side effects of initializing from a pre-trained Transformer is that we can generate videos of locations that weren’t in the original Tom and Jerry cartoons. “Around the World” - A 30-second video from earlier in training.
Today, we're releasing a new paper – One-Minute Video Generation with Test-Time Training. We add TTT layers to a pre-trained Transformer and fine-tune it to generate one-minute Tom and Jerry cartoons with strong temporal consistency. Every video below is produced directly by…
AI (using TTT) now creates one-minute-long videos from a single prompt! Researchers have developed a method for creating one-minute videos with particularly fluid motion and high temporal consistency. To do this, they use test-time training (TTT) and integrate…
Test-Time Training (TTT) now works on video! And not just a 5-second video: we can generate a full 1-minute video! The TTT module is an RNN module that provides an explicit and efficient memory mechanism. It models the hidden state of the RNN with a machine learning model, which is updated…
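A minimal NumPy sketch of that idea with a linear inner model: the layer's hidden state is a weight matrix W, updated by a gradient step of a self-supervised loss on each token and then used to produce the output. The k/v/q projections, the per-token single-step update, and the squared inner loss are simplifying assumptions; the paper's TTT layers also use richer inner models (e.g. small MLPs) and batched updates.

```python
import numpy as np

def ttt_linear(tokens, Wk, Wv, Wq, lr=0.1):
    """Sketch of a TTT layer: the RNN state is an inner model's weights."""
    d = tokens.shape[1]
    W = np.zeros((d, d))                 # hidden state = inner model weights
    outputs = []
    for x in tokens:
        k, v, q = Wk @ x, Wv @ x, Wq @ x
        err = W @ k - v                  # inner loss: 0.5 * ||W k - v||^2
        W -= lr * np.outer(err, k)       # one SGD step updates the state
        outputs.append(W @ q)            # read out with the updated model
    return np.stack(outputs)

rng = np.random.default_rng(0)
d, n = 16, 32
Wk, Wv, Wq = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
out = ttt_linear(rng.standard_normal((n, d)), Wk, Wv, Wq)
print(out.shape)  # (32, 16)
```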
💙Neo
We loved hosting @SamA for an intimate gathering with @Neo Scholars. Thank you @Alfred_Lin for offering your home! ❤️
Cool to see modern SWE agents taking systems-oriented approaches to reducing large search spaces with different fault-localization strategies. Extremely large state+action spaces seemed to be the greatest choke point on the critical path half a year ago.
Feel like the traditional approach was to build out the fundamentals of your business by ignoring big launches, iterating through assumptions, and testing your way to PMF. That approach feels outdated in the current landscape.