Saining Xie

@sainingxie

researcher in #deeplearning #computervision | assistant prof at @nyu_courant | rs @googledeepmind | past: rs @meta (FAIR) @ucsandiego

New York

Joined July 2020

1K Following

22K Followers

Saining Xie Retweeted

Jonathan Lorraine@jonLorraine9 · Jul 7

As the original poster of this strategy (sorry!), I agree that it is a bit unethical in reviews, but that is so overblown here. I do not advocate doing this strategy, but... My main issues are: Safety: Be cautious when taking large blocks of text, dropping them into an LLM, and…

Saining Xie Retweeted

#ICCV2025@ICCVConference · Jul 20

#ICCV2025 is deeply committed to promoting diversity, equity, and inclusion within our community. As part of this commitment, travel support is available to help broaden participation. Applications will be reviewed on a rolling basis until August 20, 2025 (anywhere on Earth).

Saining Xie@sainingxie · Jul 12

yes

vik@vikhyatk · Jul 10

i mostly use my visual intelligence when trying to solve this. sota approaches to arc agi are mostly symbolic; vision doesn't really work well with today's models. ergo this is really because we haven't really solved visual reasoning AI

Saining Xie@sainingxie · Jul 10

The three biggest hps for stable training in everything are lr, bs, and beta2. We’ve built up good intuitions on how to tune them over time, but this lays it all out analytically and convincingly. this is definitely my new handbook for training big models on small gpus.

Micah Goldblum@micahgoldblum · Jul 10

🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n

Saining Xie Retweeted

Albert Gu@_albertgu · Jul 8

I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit) In a few days, we'll release what I believe is the next major advance for architectures.

Saining Xie@sainingxie · Jul 7

internet at its peak--just look at how people roasted him in quotes/comments 6 months ago. Again, I think this is very wrong, but can you really blame the students if the community was encouraging this idea, and then suddenly next day they’re being treated like the big villain?

Ashwinee Panda@PandaAshwinee · Nov 18

DO NOT DO THIS. I have previously raised this for Ethics Review when I saw it in a paper. You are not sneaky.

Saining Xie Retweeted

Alexi Gladstone@AlexiGlad · Jul 7

How can we unlock generalized reasoning? ⚡️Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-forward) transformers and unlocks generalized reasoning/thinking on any modality/problem without rewards. TLDR: - EBTs are the first model to outscale the…

Saining Xie@sainingxie · Jul 7

Thanks for bringing this to my attention. I honestly wasn’t aware of the situation until the recent posts started going viral. I would never encourage my students to do anything like this—if I were serving as an Area Chair, any paper with this kind of prompt would be…

Saining Xie@sainingxie · Jul 7

Saining Xie@sainingxie · Jul 5

This looks like a great deep dive on neural network architectures for diffusion models. tl;dr use a Transformer, but there's quite a bit more to it, and as always in this field, the devil is in the details!

Sayak Paul@RisingSayak · Jul 4

Had the honor to present diffusion transformers at CS25, Stanford. The place is truly magical. Slides: bit.ly/dit-cs25 Recording: youtu.be/vXtapCFctTI?si… Thanks to @stevenyfeng for making it happen!

Saining Xie Retweeted

Manling Li@ManlingLi_ · Jun 30

Can VLMs build Spatial Mental Models like humans? Reasoning from limited views? Reasoning from partial observations? Reasoning about unseen objects behind furniture / beyond current view? Check out MindCube! 🌐mll-lab-nu.github.io/mind-cube/ 📰arxiv.org/pdf/2506.21458…

Saining Xie@sainingxie · Jun 30

awesome work by @jiacheng_chen_ and @sanghyunwoo1219 on 3D-grounded visual compositing (and nice demos!)

Sanghyun Woo@sanghyunwoo1219 · Jun 30

Introducing BlenderFusion: Reassemble your visual elements—objects, camera, and background—to compose a new visual narrative. Play the interactive demo: blenderfusion.github.io

Saining Xie@sainingxie · Jun 27

metaquery is now open-source — with both the data and code available.

Xichen Pan@xichen_pan · Jun 27

The code and instruction-tuning data for MetaQuery are now open-sourced! Code: github.com/facebookresear… Data: huggingface.co/collections/xc… Two months ago, we released MetaQuery, a minimal training recipe for SOTA unified understanding and generation models. We showed that tuning few…

Saining Xie Retweeted

Andrej Karpathy@karpathy · Jun 27

Do people *feel* how much work there is still to do. Like wow.

Saining Xie Retweeted

Tal Linzen@tallinzen · Jun 21

I'm hiring at least one post-doc! We're interested in creating language models that process language more like humans than mainstream LLMs do, through architectural modifications and interpretability-style steering.

Saining Xie@sainingxie · Jun 20

guys, real geospatial data is a total goldmine for digital agents. step away from the web browser and get real. (we explored a bit in virl-platform.github.io, but building a simulation-ready pipeline like this could take things way further)

Chuang Gan@gan_chuang · Jun 20

Virtual Community provides an online pipeline that automatically generates 3D scenes from real geospatial data, performing comprehensive cleaning and enhancement of both geometry and texture — including mesh simplification, texture refinement, object placement, and automatic…

Saining Xie@sainingxie · Jun 18

wait, speaking of false dichotomies---during your phd, you *can* write code, dive into data and systems, collaborate with a team, and build useful things---all while enjoying complete openness and the freedom to pursue what *genuinely* excites you.

Suvansh Sanjeev@SuvanshSanjeev · Jun 17

i left my phd before joining openai. working in industry demands more rigor – you don’t just need to convince reviewer 2 with a nice graph and an ego-cite, it better actually work if it’s underwriting billions in research investment. not saying it always pans out that way in…

Saining Xie Retweeted

Mathurin Massias@mathusmassias · Jun 18

New paper on the generalization of Flow Matching arxiv.org/abs/2506.03719 🤯 Why does flow matching generalize? Did you know that the flow matching target you're trying to learn **can only generate training points**? with @Qu3ntinB, Anne Gagneux & Rémi Emonet 👇👇👇

Saining Xie Retweeted

Benjamin Feuer@FeuerBenjamin · Jun 18

So excited to announce the DCVLR (Data Curation for Vision-Language Reasoning) competition at NeurIPS 2025, led by @Oumi_PBC and sponsored by @LambdaAPI! 🌟open-data 🌟 🤖 open-models 🤖 💻 open-source 💻 💪anyone can compete for free 💪 dcvlr-neurips.github.io 🧵 1 / n

Saining Xie Retweeted

Rohan Paul@rohanpaul_ai · Jun 16

This is really BAD news of LLM's coding skill. ☹️ The best Frontier LLM models achieve 0% on hard real-life Programming Contest problems, domains where expert humans still excel. LiveCodeBench Pro, a benchmark composed of problems from Codeforces, ICPC, and IOI (“International…
