Yiding Jiang
@yidingjiang
PhD student @mldcmu @SCSatCMU. Formerly intern @MetaAI, AI resident @GoogleAI. BS from @Berkeley_EECS. Trying to understand stuff.
Selecting good pretraining data is crucial, but rarely economical. Introducing ADO, an online solution to data selection with minimal overhead. 🧵 1/n

Today @ChenHenryWu and I will be presenting our #ICML work on creativity in the Oral 3A Reasoning session (West Exhibition Hall C), 10-11 am PT. Or stop by our poster right after at East Exhibition Hall A-B #E-2505, 11 am-1:30 pm. (Hope you enjoy some silly human drawings!)
On Monday, I'll be presenting a tutorial on jailbreaking LLMs + the security of AI agents with @HamedSHassani and @aminkarbasi at ICML. I'll be in Vancouver all week -- send me a DM if you'd like to chat about jailbreaking, AI agents, robots, distillation, or anything else!
My favorite reading of the week, by @yidingjiang: the next era is not about learning from data but about deciding what data to learn from. yidingjiang.github.io/blog/post/expl…
Good blog on the "era of exploration":
- Data scarcity is the new bottleneck. LLMs consume data far faster than humans can produce it. We're running out of high-quality training data.
- Pretraining solved exploration by accident. Pretraining effectively pays a massive, upfront…
Recently, there has been a lot of talk of LLM agents automating ML research itself. If Llama 5 can create Llama 6, then surely the singularity is just around the corner. How can we get a pulse check on whether current LLMs are capable of driving this kind of total…
A mental model I find useful: all data acquisition (web scrapes, synthetic data, RL rollouts, etc.) is really an exploration problem 🔍. This perspective has some interesting implications for where AI is heading. Wrote down some thoughts: yidingjiang.github.io/blog/post/expl…
Prequential coding is such a lovely lens for thinking about curriculum learning.
Data selection and curriculum learning can be formally viewed as a compression protocol via prequential coding. New blog (with @AllanZhou17) about this neat idea that motivated ADO but didn’t make it into the paper. yidingjiang.github.io/blog/post/curr…
How should we order training examples? In a new blogpost (w/ @yidingjiang), we explore a compression-based perspective: order your dataset to minimize its prequential codelength.
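For intuition, here is a minimal sketch (my own illustration, not code from the blog post) of what a prequential codelength looks like: stream the data in a chosen order, charge the model's current log-loss on each batch before training on it, and sum those costs. Orderings that help the model learn faster make later batches cheaper to encode, so they yield shorter codes.

```python
import torch
import torch.nn.functional as F

def prequential_codelength(model, optimizer, ordered_batches):
    """Codelength (in nats) of the data under a prequential code:
    each batch is first scored by the current model (its encoding cost),
    then used for one training step, so both sender and receiver improve
    as the stream goes on."""
    total_nats = 0.0
    for inputs, targets in ordered_batches:
        # 1. Encode: pay the current log-loss on the not-yet-seen batch.
        with torch.no_grad():
            logits = model(inputs)
            total_nats += F.cross_entropy(logits, targets, reduction="sum").item()
        # 2. Update: train on the batch so later data is cheaper to encode.
        optimizer.zero_grad()
        loss = F.cross_entropy(model(inputs), targets)
        loss.backward()
        optimizer.step()
    return total_nats  # different orderings / curricula give different totals
```

Comparing the totals for two orderings of the same dataset is then a concrete way to say one curriculum "compresses better" than another.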
✨ Love 4o-style image generation but prefer to use Midjourney? Tired of manual prompt crafting from inspo images? PRISM to the rescue! 🖼️→📝→🖼️ We automate black-box prompt engineering—no training, no embeddings, just accurate, readable prompts from your inspo images! 1/🧵
Looking beyond the next token: TRELAWNEY inserts future tokens <T>...</T> during training to teach models to plan ahead, boosting reasoning, coherence, and control.
Highlights:
- NO ARCHITECTURE CHANGES. JUST SMARTER DATA.
- works with standard decoding
- enables controllable…
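A rough sketch of how I read the data transformation described above; the span-selection heuristic, tag placement, and function name are my own guesses for illustration, not the paper's exact recipe:

```python
import random

T_OPEN, T_CLOSE = "<T>", "</T>"  # special tokens marking a glimpse of the future

def insert_future_span(tokens, span_len=3, rng=random):
    """Toy version of the <T>...</T> idea: copy a short span from later in
    the sequence and splice it in earlier, wrapped in tags, so the model is
    trained to condition on (and eventually propose) a future target."""
    if len(tokens) < 2 * span_len:
        return tokens
    insert_at = rng.randrange(1, len(tokens) - span_len)          # where the hint goes
    start = rng.randrange(insert_at, len(tokens) - span_len + 1)  # which future span to reveal
    future = tokens[start:start + span_len]
    return tokens[:insert_at] + [T_OPEN] + future + [T_CLOSE] + tokens[insert_at:]

print(insert_future_span("the quick brown fox jumps over the lazy dog".split()))
```

Because the change lives entirely in the training data, a model trained this way can still be decoded with standard left-to-right sampling.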
Excited to be presenting ADO next week at #ICLR2025! Check out a new blogpost we wrote that summarizes the key ideas and results (link below):
Check out our online data selection algorithm ADO at ICLR 2025! And take a look at this blog post by @yidingjiang and @AllanZhou17 summarizing the key ideas: bland.website/notes/ado/
Are current reasoning models optimal for test-time scaling? 🌠 No! Models make the same incorrect guess over and over again. We show that you can fix this problem w/o any crazy tricks 💫 – just do weight ensembling (WiSE-FT) for big gains on math! 1/N
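For context, WiSE-FT-style weight ensembling is just a convex combination of two checkpoints of the same architecture, parameter by parameter. A minimal PyTorch sketch (the checkpoint paths and the mixing coefficient below are placeholders, not the paper's settings):

```python
import torch

def wise_ft(state_dict_a, state_dict_b, alpha=0.5):
    """Weight-space ensemble (WiSE-FT): interpolate two checkpoints of the
    same architecture. alpha=0 returns model A, alpha=1 returns model B."""
    return {k: (1 - alpha) * state_dict_a[k] + alpha * state_dict_b[k]
            for k in state_dict_a}

# Hypothetical usage:
# base = torch.load("base_model.pt")
# tuned = torch.load("reasoning_finetuned.pt")
# model.load_state_dict(wise_ft(base, tuned, alpha=0.7))
```

The appeal is that it adds no inference cost: you ensemble once in weight space and then sample from a single model as usual.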