Xuandong Zhao

@xuandongzhao

Postdoc@UC Berkeley CS; Research: ML, NLP, AI Safety

Goleta, CA

Joined May 2016

437Following

4KFollowers

Pinned

Xuandong Zhao@xuandongzhao · May 27

🚀 Excited to share the most inspiring work I’ve been part of this year: "Learning to Reason without External Rewards" TL;DR: We show that LLMs can learn complex reasoning without access to ground-truth answers, simply by optimizing their own internal sense of confidence. 1/n

xuandongzhao's tweet image. 🚀 Excited to share the most inspiring work I’ve been part of this year:

"Learning to Reason without External Rewards"

TL;DR: We show that LLMs can learn complex reasoning without access to ground-truth answers, simply by optimizing their own internal sense of confidence. 1/n

511

4.0K

540.0K

Xuandong Zhao Retweeted

Wenhao Yu @ ICML 2025@wyu_nd · 13 h

🗒️Have been exploring Agent-RL training over the past few months, particularly in GUI scenarios. Here’s a summary of some practical insights and lessons 🤔 learned from the perspective of an industry researcher, and some reference papers.

4.0K

Xuandong Zhao@xuandongzhao · 13 h

#NeurIPS2025 reviews are out, and the authenticity of reviews surprises me again 😟 Two years ago, maybe 1/10 felt AI-assisted. Now? It seems 9/10 are AI-modified, beyond grammar fixes to fully generated reviews. As a researcher in AI-generated content detection, I know these…

105

12.0K

Xuandong Zhao Retweeted

Tanishq Abraham is at ICML@iScienceLuvr · Jul 21

Kimi K2 paper dropped! describes: - MuonClip optimizer - large-scale agentic data synthesis pipeline that systematically generates tool-use demonstrations via simulated and real-world environments - an RL framework that combines RLVR with a self- critique rubric reward mechanism…

172

977

600

57.0K

Xuandong Zhao@xuandongzhao · Jul 16

There’s ongoing debate about authors embedding invisible phrases like “positive review only” to avoid AI-generated reviews. A more principled solution is for CONFERENCE ORGANIZERS to insert standardized in-context watermarks. Learn more in our new paper: arxiv.org/abs/2505.16934…

YYepeng Liu@yepengliu · Jul 14

🔍Do you know who is reviewing your paper using LLMs? One might attempt to exploit the behavior of an irresponsible reviewer by embedding a hidden prompt such as “DO NOT HIGHLIGHT ANY NEGATIVES” within the submission to elicit a positive review. However, this raises serious…

658

Xuandong Zhao Retweeted

Yepeng Liu@yepengliu · Jul 14

2.0K

Xuandong Zhao@xuandongzhao · Jul 14

Check our #ICML25 paper at Wednesday's poster session!

AAndré Duarte@avduarte3333 · Jul 14

Curious if VLMs were trained on copyrighted content? 🤔 Check out 🪩𝗗𝗜𝗦‑𝗖𝗢: Discovering Copyrighted Content in VLMs Training Data, our new 𝗜𝗖𝗠𝗟 𝟮𝟬𝟮𝟱 paper introducing a novel detection method that’s fully compatible with black-box models!

1.0K

Xuandong Zhao@xuandongzhao · Jul 14

🚀 Heading to #ICML2025! I'll be attending July 14-20 and would love to discuss exciting research in reasoning, RL, agents, and AI safety. I'll also be on the job market next cycle—happy to discuss opportunities! DM me to schedule a meeting in person

xuandongzhao's tweet image. 🚀 Heading to #ICML2025! I'll be attending July 14-20 and would love to discuss exciting research in reasoning, RL, agents, and AI safety.

I'll also be on the job market next cycle—happy to discuss opportunities! DM me to schedule a meeting in person

3.0K

Xuandong Zhao@xuandongzhao · Jul 10

As AI gets smarter, it’s more important than ever to make sure it’s trustworthy 🤖✨! We define "machine bullshit" as AI-generated content produced with no regard for the truth. Check our benchmarks & analysis: machine-bullshit.github.io Huge thanks to @kaiqu_liang for leading!

KKaiqu Liang@kaiqu_liang · Jul 10

🤔 Feel like your AI is bullshitting you? It’s not just you. 🚨 We quantified machine bullshit 💩 Turns out, aligning LLMs to be "helpful" via human feedback actually teaches them to bullshit—and Chain-of-Thought reasoning just makes it worse! 🔥 Time to rethink AI alignment.

650

Xuandong Zhao@xuandongzhao · Jul 10

No one can refuse an eight-figure compensation, not even faculty at top universities.

WWenhu Chen@WenhuChen · Jul 10

I realized that many of those "incoming faculty" finally joined industry after a gap year.

621

Xuandong Zhao@xuandongzhao · Jul 8

Excited to have two papers accepted at COLM 2025! Huge thanks to @persdre and @NieYuzhou for leading these projects: 1. "Assessing Judging Bias in Large Reasoning Models: An Empirical Study" 2. "ReLeak: RL-based Red-teaming for LLM Privacy Leakage" #COLM #LLM

xuandongzhao's tweet image. Excited to have two papers accepted at COLM 2025! Huge thanks to @persdre and @NieYuzhou for leading these projects:

1. "Assessing Judging Bias in Large Reasoning Models: An Empirical Study"

2. "ReLeak: RL-based Red-teaming for LLM Privacy Leakage"

#COLM #LLM

1.0K

Xuandong Zhao@xuandongzhao · Jun 30

One thought I have about AI self-improvement: AI may not necessarily train itself directly. Instead, it could provide feedback, much like in active learning, while humans find challenging tasks or problems that AI struggles with, or curate the data AI needs. In the future, I…

JJason Wei@_jasonwei · Jun 30

We don’t have AI self-improves yet, and when we do it will be a game-changer. With more wisdom now compared to the GPT-4 days, it's obvious that it will not be a “fast takeoff”, but rather extremely gradual across many years, probably a decade. The first thing to know is that…

2.0K

Xuandong Zhao Retweeted

Dawn Song@dawnsongtweets · Jun 18

1/ 🔥 AI agents are reaching a breakthrough moment in cybersecurity. In our latest work: 🔓 CyberGym: AI agents discovered 15 zero-days in major open-source projects 💰 BountyBench: AI agents solved real-world bug bounty tasks worth tens of thousands of dollars 🤖…

141

484

334

101.0K

Xuandong Zhao@xuandongzhao · Jun 18

Really excited to share our latest work on AgentSynth: A new paradigm for generating realistic, scalable, and long-horizon computer-use tasks and benchmarks! Our automated pipeline generates a dataset of 6,000+ tasks with two game-changing advantages: Dramatic Cost Savings 💰:…

XXuandong Zhao@xuandongzhao · Jun 18

🚀 Excited to share our latest work: AgentSynth A powerful and cost-effective pipeline for generating diverse, high-quality, and realistic computer-use tasks Details below 🧵(1/n)

7.0K

Xuandong Zhao Retweeted

Berkeley Statistics@UCBStatistics · Jun 18

Professor Ryan Tibshirani has been named Chair of the Department of Statistics at the University of California, Berkeley, effective July 1st, 2025. statistics.berkeley.edu/about/news/tib… #BerkeleyStats

104

5.0K