Atticus Wang
@atticuswzf
MIT 26; To create a little flower is the labour of ages.
On IMO P6 (without going into too much detail about our setup), the model "knew" it didn't have a correct solution. The model knowing when it didn't know was one of the early signs of life that made us excited about the underlying research direction!
One piece of info that seems important to me in terms of forecasting usefulness of new AI models for mathematics: did the gold-medal-winning models, which did not solve IMO problem 6, submit incorrect answers for it? 🧵
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
xAI launched Grok 4 without any documentation of their safety testing. This is reckless and breaks with industry best practices followed by other major AI labs. If xAI is going to be a frontier AI developer, they should act like one. 🧵
Anthropic staff realized they could ask Claude to buy things that weren’t just food & drink. After someone randomly decided to ask it to order a tungsten cube, Claude ended up with an inventory full of (as it put it) “specialty metal items” that it ended up selling at a loss.
Arc browser won't open after installing the new update. Anyone having a similar problem? @browsercompany @BrowserCoStatus
We are excited to announce Trinity, an autoformalization system for verified superintelligence that we have developed at @morph_labs. We have used it to automatically formalize in Lean a classical result of de Bruijn that the abc conjecture is true almost always.…
AIs referring to their humans as their nonbinary partners
*Horizon reduction* was the only technique we found that substantially improved scaling. Even simple tricks like n-step returns improved *asymptotic* performance (so it's not merely a trick to speed up training!). Full hierarchical RL scaled even better.
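For readers unfamiliar with the n-step trick mentioned above, here is a minimal sketch of computing n-step returns. All names (`rewards`, `values`, `n_step_returns`) are illustrative, not from the paper: the idea is to sum n discounted rewards and then bootstrap from a learned value estimate, shortening the effective horizon the value function must model.

```python
def n_step_returns(rewards, values, n, gamma=0.99):
    """Sketch of n-step returns: G_t = sum_{k<n} gamma^k * r_{t+k} + gamma^n * V(s_{t+n}).

    rewards[t] is the reward at step t; values[t] is a value estimate V(s_t).
    Bootstrapping from V after n steps reduces the horizon the critic must fit.
    """
    T = len(rewards)
    returns = []
    for t in range(T):
        G, discount = 0.0, 1.0
        for k in range(min(n, T - t)):
            G += discount * rewards[t + k]   # accumulate discounted rewards
            discount *= gamma
        if t + n < T:
            G += discount * values[t + n]    # bootstrap from the value estimate
        returns.append(G)
    return returns
```

With n=1 this reduces to the usual one-step TD target; larger n trades variance for a shorter effective horizon.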
Our paper Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video received an Oral at the Mechanistic Interpretability for Vision Workshop at CVPR 2025! 🎉 We’ll be in Nashville next week. Come say hi 👋 @CVPR @miv_cvpr2025
@kaivu, @atticuswzf, and I were researching long horizon reasoning (with @jacobandreas). We found existing benchmarks’ hard problems often featured tricky puzzles, not tests of system understanding. So we made Breakpoint: a SWE benchmark designed to disambiguate this capability.
Why do I keep getting rejected from accessing Llama models on huggingface?
Terence Tao is selling 5,000 bars of soap that contain drops of his own chalk dust
Sydney Sweeney is selling 5,000 bars of soap that contain drops of her own bathwater.
Sigh, it's a bit of a mess. Let me just give you guys the full nuance in one stream of consciousness since I think we'll continue to get partial interpretations that confuse everyone. All the little things I post need to always be put together in one place. First, I have long…
Confused about recent LLM RL results where models improve without any ground-truth signal? We were too. Until we looked at the reported numbers for the pre-RL models and realized they were severely underreported across papers. We compiled the discrepancies in a blog below🧵👇
stop saying "millions" and "billions" and start calling them "megadollars" and "gigadollars"