sachit gaudi

@gaudi_sachit

Research: Generalisation, Diffusion models. Grad student @michiganstateu. IIT Guwahati Alum.

Joined January 2015

471 Following

65 Followers

sachit gaudi@gaudi_sachit · Jul 14

Wow! This is so beautiful and well written. Addicted to this homework!

Percy Liang@percyliang · Jun 18

Assignment 1 (get basic pipeline working): implement BPE tokenizer, Transformer architecture, Adam optimizer, train models on TinyStories and OpenWebText. Only PyTorch primitives are allowed (can’t just call torch.nn.Transformer or even torch.nn.Linear). github.com/stanford-cs336…

sachit gaudi@gaudi_sachit · Jul 9

Trustworthy representations must be robust against distribution shifts. @gautamsree_ shows that you must explicitly use your interventional causal knowledge to learn robust representations, instead of simply augmenting your dataset with interventional data.

Gautam Sreekumar@gautamsree_ · Jul 9

Representations learned using standard ERM often fail to generalize under interventional distribution shifts because they ignore the causal structure revealed by interventions. Here's how to learn robust representations. 🧵👇(1/9)

sachit gaudi Retweeted

AK@_akhaliq · Jul 4

Energy-Based Transformers are Scalable Learners and Thinkers

sachit gaudi@gaudi_sachit · Jul 2

Asymmetry of NeurIPS reviews: Authors can submit half-baked work with no penalization, but reviewers are expected to evaluate to a very high standard or face significant penalties.

sachit gaudi@gaudi_sachit · Jun 3

First open-source merge incoming.

sachit gaudi@gaudi_sachit · May 9

60% of my code is now written by AI.