Hu Xu (@ICML25)
@Hu_Hsu
FAIR, Data Research, end-to-end data, model, and training, MetaCLIP (scaling CLIP from scratch), DINOv2, Llama
🎉 MetaCLIP accepted as an #ICLR2024 spotlight. Big thanks/congrats to @sainingxie @ellenxtan0 @berniebear_1220 @RussellHowes1 @sharmavasu55 @ShangwenLi1 @gargighosh @LukeZettlemoyer @cfeichtenhofer, and thanks to @_akhaliq for the post. Code: github.com/facebookresear… with ViT-G (82.1 on ImageNet), a demo, and the metadata builder.
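Not the official snippet, just a minimal sketch of loading a MetaCLIP checkpoint through Hugging Face transformers' generic CLIP classes for zero-shot scoring; the model id below is an assumption, so check the repo's README for the released checkpoint names (including the ViT-G one).

```python
# Minimal sketch: zero-shot image-text scoring with a MetaCLIP checkpoint
# via transformers' generic CLIP classes. The model id is an assumption.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_id = "facebook/metaclip-b32-400m"  # assumed name; swap for the ViT-G release if available
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

image = Image.open("cat.jpg")
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image: image-text similarities scaled by the learned temperature
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```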

Excited that Meta CLIP 1.2 is part of this release during #NeurIPS2024: we now have a synthetic engine that re-aligns noisy alt-texts into dense captions.
It's been a big year for AI, and today I'm excited to share nine new open source releases from Meta FAIR to wrap up the year — all part of our mission to achieve advanced machine intelligence (AMI). So proud of this team and looking forward to 2025! ai.meta.com/blog/meta-fair…
Thanks for the invitation to speak; happy to share our industry insights on “scaling data alignment” from Meta CLIP (its wide adoption and what’s next) at the DataWorld workshop #ICML2025. Happy to chat offline about data research.
If you are attending #ICML2025, check out our DataWorld workshop on Sat July 19. We have updated the website with more info on speakers & accepted papers! dataworldicml2025.github.io Also happy to chat offline about all things ✨ data ✨
Heading to #ICML2025 (first time). Excited to meet new friends and old friends and to chat about foundational data research and its co-design with training (MetaCLIP), SelfCite arxiv.org/abs/2502.09604 with @YungSungChuang, and LongVU arxiv.org/abs/2410.17434 with @xiaoqian_shen.
A new start for FAIR, and excited to be part of it.
Rob Fergus is the new head of Meta-FAIR! FAIR is refocusing on Advanced Machine Intelligence: what others would call human-level AI or AGI. linkedin.com/posts/rob-ferg…
Congrats @rob_fergus! Big win for FAIR
1/ Excited to share that I’m taking on the role of leading Fundamental AI Research (FAIR) at Meta. Huge thanks to Joelle for everything. Look forward to working closely again with Yann & team.
Great to see the MetaCLIP algorithm (arxiv.org/abs/2309.16671) desaturate the SSL training distribution as SSL 2.0. What’s next in SSL or pre-training? From our data research perspective, it’s likely about how to automatically desaturate a training distribution (a sketch of the balancing idea is below).
New paper from FAIR+NYU: Q: Is language supervision required to learn effective visual representations for multimodal tasks? A: No. ⬇️⬇️⬇️
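To make “desaturate a training distribution” concrete, here is a minimal sketch of the balancing step in the MetaCLIP curation algorithm: alt-texts are substring-matched against metadata entries, and head entries are sub-sampled to a per-entry cap t so no concept dominates. Variable names, the cap value, and the toy counts are illustrative, not the repo's.

```python
# Sketch of MetaCLIP-style balancing (arxiv.org/abs/2309.16671):
# keep a pair if any of its matched metadata entries samples it.
import numpy as np

def balance_sampling(matched_entry_ids, entry_counts, t=20_000, seed=0):
    """matched_entry_ids: list of lists, metadata entry ids matched per image-text pair
    entry_counts: array of match counts per metadata entry
    t: per-entry cap; tail entries (count < t) are kept in full
    """
    rng = np.random.default_rng(seed)
    # keep probability per entry: 1 for the tail, t/count for the head
    entry_prob = np.minimum(1.0, t / np.maximum(entry_counts, 1))
    keep = []
    for ids in matched_entry_ids:
        keep.append(any(rng.random() < entry_prob[i] for i in ids))
    return np.array(keep)

# toy usage: entry 0 is a head concept (1M matches), entry 1 is tail (500 matches)
counts = np.array([1_000_000, 500])
mask = balance_sampling([[0], [1], [0, 1]], counts)
print(mask)  # pairs matching only the head entry are mostly dropped
```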
Excited to see the next level of generalization: from humans’ experience (humans being one type of agent) to agents’ experience. TBH, ‘experience’ is a better name for ‘data’, and next-x prediction is learning from past experience.
David Silver really hits it out of the park in this podcast. The paper "Welcome to the Era of Experience" is here: goo.gle/3EiRKIH.
Getting an information-dense image training distribution is the key to deciding what to keep or compress in a representation; it’s probably time to rethink what supervision is, whether it appears directly in the loss or in an explicit annotation process.
New paper from FAIR+NYU: Q: Is language supervision required to learn effective visual representations for multimodal tasks? A: No. ⬇️⬇️⬇️
Congrats on the release, with many surprise features.
Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model…
It’s also true if we break down a single project: it’s extremely hard to execute steps 1 to n-1, or even to define what step 1 is; it’s easy to execute step n (by another party).
it is (relatively) easy to copy something that you know works. it is extremely hard to do something new, risky, and difficult when you don't know if it will work. individual researchers rightly get a lot of glory for that when they do it! it's the coolest thing in the world.
The slide from @ilyasut at #NeurIPS2024 seems to mean that the so-called scaling on data is never scalable in theory, because the process of creating human/agent supervision is not scalable. Excited to see his future research break this barrier.

This slide reminds me to build definitions first: pre-training means scaling on both model and data. “One Internet” means data growth is not scalable (the quality data that made blind scaling work grows at an even slower, sub-constant rate). Then what do we scale, if not pre-training data?
Brilliant talk by @ilyasut, but he's wrong on one point. We are NOT running out of data. We are running out of human-written text. We have more videos than we know what to do with. We just haven't solved pre-training in vision. Just go out and sense the world. Data is easy.
Due to an urgent family issue, I will miss #EMNLP2024, but the amazing @ellenxtan0 will present Altogether at 4pm on Tuesday, Session 04. Join if you are interested in the next generation of synthetic data and image captioning.
(1/5) 🎉 [New Paper] Altogether: Image Captioning via Re-aligning Alt-text arxiv.org/abs/2410.17251 is accepted at #EMNLP2024: we re-align existing alt-texts into captions for the image instead of captioning images from scratch (e.g., MS-COCO-style captioning).
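A minimal sketch of the re-alignment interface the paper describes, assuming a pluggable multimodal captioner: instead of generating a caption from scratch, the model conditions on the existing alt-text and densifies it. `run_captioner` and the prompt wording are hypothetical stand-ins, not the paper's exact setup.

```python
# Sketch of alt-text re-alignment (Altogether, arxiv.org/abs/2410.17251):
# the captioner starts from the alt-text rather than from nothing.

def build_realignment_prompt(alt_text: str) -> str:
    return (
        "You are given an image and its original alt-text.\n"
        f"Alt-text: {alt_text}\n"
        "Rewrite the alt-text into a dense caption: keep the facts it already "
        "states (names, places, counts) and add only details visible in the image."
    )

def realign_alt_text(image, alt_text: str, run_captioner) -> str:
    """run_captioner(image, prompt) -> str is assumed to wrap a multimodal model."""
    return run_captioner(image, build_realignment_prompt(alt_text))

# toy usage with a dummy captioner that just reports the prompt length
print(realign_alt_text(image=None, alt_text="Eiffel Tower at night",
                       run_captioner=lambda img, p: f"[caption from prompt of {len(p)} chars]"))
```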