Jifan Zhang
@jifan_zhang
Research Fellow @AnthropicAI | Ph.D. @WisconsinCS @WIDiscovery | Previously BS/MS @uwcse, @Meta @Google @Amazon
Deep Active Learning in the Open World. Tian Xie, Jifan Zhang, Haoyue Bai, Robert D. Nowak. Action editor: Vincent Fortuin. openreview.net/forum?id=Hkmym… #classification #annotation #classes
When are we going to get AI agents managing experiments for us? Running large-scale experiments always messes up my sleep🥲
Very cool result! I would not have predicted that when the model inits are the same, distillation transmits so much hidden information about the teacher. (This is much more powerful than emergent-misalignment-like phenomena!)
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
Subliminal learning: training on model-generated data can transmit traits of that model, even if the data is unrelated. Think: "You can learn physics by watching Einstein do yoga" I'll discuss how this introduces a surprising pitfall for AI developers 🧵x.com/OwainEvans_UK/…
🚨 Olympiad math + AI: We ran Google’s Gemini 2.5 Pro on the fresh IMO 2025 problems. With careful prompting and pipeline design, it solved 5 out of 6 — remarkable for tasks demanding deep insight and creativity. The model could win gold! 🥇 #AI #Math #LLMs #IMO2025
Congrats Lalit and the GDM team for winning🏅!
It was amazing to be part of this effort. Huge shout out to the team, and all the incredible pre-training and post-training efforts that ensure Gemini is the leading frontier model! deepmind.google/discover/blog/…
How far are LLMs from creating an entire set of IMO problems?
How can we train a foundation model to internalize what it means to “explore”? Come check out our work on “behavioral exploration” at ICML25 to find out!
Looking forward to seeing folks tomorrow afternoon!
Working in active learning, I often hear a common complaint — active learning doesn’t work sometimes. Our ICML 2025 paper JUST WORKS* when you have class imbalance. @rdnowak will present it Tuesday afternoon. Short 🧵below Paper: arxiv.org/abs/2312.09196
xAI launched Grok 4 without any documentation of their safety testing. This is reckless and breaks with industry best practices followed by other major AI labs. If xAI is going to be a frontier AI developer, they should act like one. 🧵
Releasing HumorBench today. Grok 4 is🥇 on this uncontaminated, non-STEM humor reasoning benchmark. 🫡🫡@xai Here are a couple of things I find surprising👇 1. This benchmark yields an almost perfect rank correlation with ARC-AGI. Yet the task of reasoning about New Yorker style…
Whoa... Grok 4 beats o3 on our never-released benchmark: HumorBench, a non-STEM reasoning benchmark that measures humor comprehension. The task is simple: given a New Yorker Caption Contest cartoon and caption, explain the joke.