Shikai Qiu
@ShikaiQiu
ML PhD student, Scaling | Prev SR @GoogleDeepMind, Physics @UCBerkeley
While scaling laws typically predict only the final loss, we show in our ICML oral paper that good scaling rules enable accurate prediction of the entire loss curves of larger models from those of smaller ones! w/@Locchiu, @andrewgwils, J. Pennington, A. Agarwala: arxiv.org/abs/2507.02119 1/10
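
For context, the "final loss" baseline mentioned above is usually a Chinchilla-style power-law fit: fit a saturating power law to the final losses of small training runs, then extrapolate a single number for a larger run. The sketch below illustrates that standard workflow on synthetic numbers; it is not the paper's method for predicting full loss curves.

```python
# Minimal sketch of the usual scaling-law workflow (synthetic data, not the
# paper's method): fit L(C) = E + A * C**(-alpha) to the *final* losses of
# small runs and extrapolate the final loss of a much larger run.
import numpy as np
from scipy.optimize import curve_fit

def final_loss_law(compute, E, A, alpha):
    """Saturating power law for the final loss as a function of training compute."""
    return E + A * compute ** (-alpha)

rng = np.random.default_rng(0)
compute = np.logspace(17, 19, 6)                     # compute of small runs (FLOPs)
true_loss = final_loss_law(compute, 1.8, 60.0, 0.09) # made-up "measurements"
observed = true_loss + rng.normal(0.0, 0.01, size=compute.shape)

params, _ = curve_fit(final_loss_law, compute, observed, p0=(2.0, 50.0, 0.1))
print("extrapolated final loss at 1e21 FLOPs:", final_loss_law(1e21, *params))
```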

In our new ICML paper, we show that popular families of OOD detection procedures, such as feature- and logit-based methods, are fundamentally misspecified: they answer a different question than “is this point from a different distribution?” arxiv.org/abs/2507.01831 [1/7]
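
For readers unfamiliar with the terminology, "feature- and logit-based" refers to OOD scores computed from a trained classifier's internals. Two standard logit-based baselines are sketched below for context; they are well-known scores from the literature, not methods proposed in this paper, and the tweet's point is that scores of this kind answer a different question than intended.

```python
# Standard examples of "logit-based" OOD scores (context only).
# "Feature-based" methods instead score penultimate-layer features,
# e.g. by Mahalanobis distance to class means.
import numpy as np
from scipy.special import logsumexp, softmax

def max_softmax_prob(logits: np.ndarray) -> np.ndarray:
    """Maximum softmax probability; higher is treated as 'more in-distribution'."""
    return softmax(logits, axis=-1).max(axis=-1)

def energy_score(logits: np.ndarray) -> np.ndarray:
    """Log-sum-exp of logits (energy score); higher is treated as 'more in-distribution'."""
    return logsumexp(logits, axis=-1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))   # fake logits from a 10-class classifier
print(max_softmax_prob(logits))
print(energy_score(logits))
```
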
Why do larger language models generalize better? In our new ICLR paper, we derive an interpretable generalization bound showing that compute-optimal LLMs provably generalize better with scale! 📄arxiv.org/abs/2504.15208 1/7🧵
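
As a point of reference, generalization bounds typically bound the population risk by the empirical risk plus a complexity-over-sample-size term; "generalizing better with scale" means that gap shrinks as compute-optimal model and dataset sizes grow together. The block below shows a generic textbook Occam-style bound of this shape, not the bound derived in the paper.

```latex
% Generic Occam-style bound for a loss bounded in [0,1] (textbook form; NOT the
% bound derived in the paper). For a hypothesis h with prior probability P(h),
% with probability at least 1 - \delta over a sample of size n:
\[
  R(h) \;\le\; \hat{R}_n(h)
  + \sqrt{\frac{\ln \tfrac{1}{P(h)} + \ln \tfrac{1}{\delta}}{2n}},
\]
% where R(h) is the population risk, \hat{R}_n(h) the empirical risk, and
% \ln \tfrac{1}{P(h)} plays the role of a complexity (description-length) term.
```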