Casey Chu

@caseychu9

Researcher at @openai

San Francisco, CA

Joined August 2017

676 Following

4K Followers

Pinned

Casey Chu@caseychu9 · Jul 17

We launched ChatGPT Agent today! When tested on a variety of REAL work tasks (expert tasks that might take >10h), we found that its output was human-quality almost 50% of the time. Agent puts o3's intelligence into practice - try your work tasks and let us know how it goes!

OpenAI@OpenAI · Jul 17

ChatGPT can now do work for you using its own computer. Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths.

Casey Chu Retweeted

Evan Morikawa@E0M · Sep 4

An intuition for relative memory access times (scaled 10^10):
Reg: 2 sec - Take from shelf
Cache: 6½ min - Get from garage
DDR Main: 20 min - Go to store
DDR CXL: 1hr
Far Mem: 8hr
SSD: 6 days - Order online
Spinning Disk (3ms): 1yr!
Via @dylan522p & @SemiAnalysis_
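As a rough check of the arithmetic in the tweet above, the sketch below divides each scaled duration by the stated factor of 10^10 to recover an approximate real latency. The dictionary keys simply mirror the tweet's labels. The helper name `to_nanoseconds` is hypothetical, and the resulting figures are only as precise as the rounded numbers in the tweet.

```python
SCALE = 10**10  # scale factor stated in the tweet

# Scaled durations from the tweet, converted to seconds.
scaled_seconds = {
    "Reg": 2,
    "Cache": 6.5 * 60,
    "DDR Main": 20 * 60,
    "DDR CXL": 60 * 60,
    "Far Mem": 8 * 60 * 60,
    "SSD": 6 * 24 * 60 * 60,
}


def to_nanoseconds(scaled_s):
    """Undo the 10^10 scaling and express the result in nanoseconds."""
    return scaled_s / SCALE * 1e9


actual_ns = {name: to_nanoseconds(s) for name, s in scaled_seconds.items()}

# The spinning-disk entry goes the other way: the tweet gives the real
# latency (3 ms), so scale it up and express it in days.
disk_scaled_days = 3e-3 * SCALE / 86_400

if __name__ == "__main__":
    for name, ns in actual_ns.items():
        print(f"{name:>8}: {ns:,.1f} ns")
    print(f"Spinning disk, scaled: {disk_scaled_days:.0f} days")
```

Under these assumptions a register access works out to about 0.2 ns, main memory to about 120 ns, and the 3 ms disk access to roughly 347 scaled days, which matches the tweet's "1yr" to within rounding.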

Casey Chu Retweeted

Jerry Tworek@MillionInt · Jul 19

To summarize this week:
- we released general purpose computer using agent
- got beaten by a single human in atcoder heuristics competition
- solved 5/6 new IMO problems with natural language proofs
All of those are based on the same single reinforcement learning system

Casey Chu Retweeted

Sam Altman@sama · Jul 17

watching chatgpt agent use a computer to do complex tasks has been a real "feel the agi" moment for me; something about seeing the computer think, plan, and execute hits different.

Casey Chu@caseychu9 · Jul 17

working on bringing that pass@16 number down to pass@1 💪

Epoch AI@EpochAIResearch · Jul 17

We also found that, when allowed 16 tries per problem, ChatGPT agent’s score grew from 27% to 49% on the tier 1-3 set. This suggests that better prompting or scaffolding might result in better performance from current models.

Casey Chu@caseychu9 · Jul 17

Great post from @xikun_zhang_, who did a great job making sure collaboration with Agent feels good!

Xikun Zhang 张熙堃@xikun_zhang_ · Jul 17

Just launched ChatGPT Agent (sorry GPT-5 waiters, it is coming!), the most capable AI agent model to date! It has been such an honor to be part of a crazy sprint to get this amazing model trained and shipped together with an absolutely gem team (@isafulf , @caseychu9 ,…

Casey Chu Retweeted

OpenAI@OpenAI · Jul 16

Casey Chu@caseychu9 · Jun 6

Join us in making the next generation of agents both capable and safe! We think that agents will be a big part of how we interact with AI in the future, making it critical that we think carefully about how we build them.

fouad@fouadmatin · Jun 5

We're hiring for a new team @OpenAI: Agent Robustness and Control. Our goal is to make sure our agents are safe and secure during training and deployment. Want to work on some of the hardest problems in AI today? Apply via link in reply or DM me!

Casey Chu Retweeted

Noam Brown@polynoamial · Apr 25

It's deeply concerning that one of the best AI researchers I've worked with, @kaicathyc, was denied a U.S. green card today. A Canadian who's lived and contributed here for 12 years now has to leave. We’re risking America’s AI leadership when we turn away talent like this.

Casey Chu@caseychu9 · Apr 14

been waiting years for solomonoff maximalism to become a populist position. god bless

Fan Donald J. Trump Posts From Truth Social@TrumpDailyPosts · Apr 14

THE BEST DEFINITION OF INTELLIGENCE IS THE ABILITY TO PREDICT THE FUTURE!!! From Donald Trump Truth Social 04/14/25 09:32 AM

Casey Chu Retweeted

David Duvenaud@DavidDuvenaud · Feb 27

LLMs have complex joint beliefs about all sorts of quantities. And my postdoc @jamesrequeima visualized them! In this thread we show LLM predictive distributions conditioned on data and free-form text. LLMs pick up on all kinds of subtle and unusual structure: 🧵

Casey Chu@caseychu9 · Jan 23

We launched a research preview of Operator today! It's a model built on top of GPT-4o that can control a browser — it is very early and will make mistakes, but it's a taste of things to come openai.com/index/introduc…

Casey Chu Retweeted

François Chollet@fchollet · Dec 20

Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks. It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task…

Casey Chu Retweeted

Alex Alemi@alemi · Oct 30

Why don't we measure probabilities in degrees? blog.alexalemi.com/a-degree-of-ce…

Casey Chu Retweeted

Christina Farhat@farhatchristina · Sep 4

#NYFW FW24 @nvidia 🔋✨@tessybarton

Casey Chu Retweeted

Grant Sanderson@3blue1brown · May 18, 2024

I had the joy and the honor of being invited to give the @harveymudd commencement address this year. In the vector space of all advice, I explore a 5-dimension subspace orthogonal to the “follow your dreams” vector. YouTube Link: youtu.be/W3I3kAg2J7w

Casey Chu@caseychu9 · May 15, 2024

GPT-4o would not have happened without the vision, talent, conviction, and determination of @prafdhar over a long period of time. that (along with the work of many others) led to what i hope will turn out to be a revolution in how we use computers.

Prafulla Dhariwal@prafdhar · May 15, 2024

GPT-4o (o for “omni”) is the first model to come out of the omni team, OpenAI’s first natively fully multimodal model. This launch was a huge org-wide effort, but I’d like to give a shout out to a few of my awesome team members who made this magical model even possible!

Casey Chu@caseychu9 · May 13, 2024

justice for @barret_zoph 🪵

Daniel@growing_daniel · May 13, 2024

SHE CALLED HIM A WOODEN SURFACE 😭

Casey Chu@caseychu9 · May 7, 2024

love this syntax!

Reiner Pope@reinerpope · May 7, 2024

For this, we developed a new library to express sharding more clearly. Here’s multihost FSDP and tensor parallelism (TP) for a feedforward network. “F/t” means both “F/t is the size per chip” and “tensor dimension F is sharded over t chips”. d is FSDP, t is TP.
