Mike Knoop

@mikeknoop

co-founder @ndea and @zapier @arcprize

sf bay area

Joined July 2009

339 Following

22K Followers

Pinned

Mike Knoop@mikeknoop · Jul 18

Today we’re releasing our first public preview of ARC-AGI-3: the first three games. Version 3 is a big upgrade over v1 and v2, which are designed to challenge pure deep learning and static reasoning. In contrast, v3 challenges interactive reasoning (e.g. agents). The full version…

493

128

79.0K

Mike Knoop@mikeknoop · Jul 24

I was in contact with the Qwen team trying to reproduce their 41% results on ARC-AGI-1 but ultimately couldn't. They open sourced their method and code if anyone wants to check it out and confirm. We tested their model exactly the same as we test all other models (o3-high, grok…

ARC Prize@arcprize · Jul 24

Qwen3-235b-a22b Instruct-2507
ARC-AGI Semi Private Eval
* ARC-AGI-1: 11%, $0.003/task
* ARC-AGI-2: 1.3%, $0.004/task

408

116

134.0K

Mike Knoop@mikeknoop · Jul 24

"Cloud Science" is going to be a major new jobs sector as we get closer to AGI.

2.0K

Mike Knoop@mikeknoop · Jul 23

New paper from an ARC Prize 2024 top paper author

Simon Ouellette@SimonOuellette6 · Jul 23

My new paper proposes an implementation of execution-guided neural program synthesis for ARC-AGI (@arcprize), and compares its compositional generalization capabilities with a few alternatives such as test-time fine-tuning. The conclusion is that execution-guided neural program…

2.0K

Mike Knoop@mikeknoop · Jul 23

2D grid puzzles are somehow AI kryptonite

wh@nrehiew_ · Jul 23

I’m no mathematician but curious why
1) No lab solved this
2) Is this much harder than P1-5
3) What specifically about this problem is difficult

3.0K

Mike Knoop@mikeknoop · Jul 22

Given the added complexity of dealing with interactive environments, we tried to make getting started with ARC v3 research as simple as possible.

Greg Kamradt@GregKamradt · Jul 22

> git clone https://github.com/arcprize/ARC-AGI-3-Agents.git && cd ARC-AGI-3-Agents && uv sync
> cp .env-example .env
> uv run main.py --agent=random --game=ls20
You just ran your first agent against ARC-AGI-3

2.0K

Mike Knoop Retweeted

Lance Ying@LanceYing42 · Jul 21

A hallmark of human intelligence is the capacity for rapid adaptation, solving new problems quickly under novel and unfamiliar conditions. How can we build machines to do so? In our new preprint, we propose that any general intelligence system must have an adaptive world model,…

104

480

396

60.0K

Mike Knoop@mikeknoop · Jul 20

Still looking for a useful definition of "ASI" that isn't marketing (where AGI is defined as human-level skill acquisition efficiency). A few vectors:
1. Total skill
2. Reasoning Kolmogorov complexity
3. Data efficiency
AI is already super at (1) but inferior at (2) and (3).

3.0K

Mike Knoop@mikeknoop · Jul 19

Based on public information, major AI labs are pushing two AI reasoning frontiers to improve "process models" that generate the reasoning chains (or reasoning programs):
1. More search
2. More domains
More test-time search is being deployed via improved process models to cover…

126

12.0K

Mike Knoop@mikeknoop · Jul 19

Reminder: our v3 preview launch today is to make contact with reality and learn about our game design choices. We have a lot of work to do this year. Full v3 will launch early 2026. v1 continues to be useful for measuring the Pareto frontier. And v2 remains entirely unsaturated.

Haider.@slow_developer · Jul 18

ARC-AGI 3 is already here. We haven't even completed half of ARC-AGI 2, and now there's ARC-3. And wasn't the test meant to tell us when we've reached AGI? Now the models are getting close, they keep making new tests and shifting the goalposts. Turing test passed, ARC-AGI 1…

3.0K

Mike Knoop@mikeknoop · Jul 18

Live now! Watch @johncoogan and @jordihays try to play ARC v3 x.com/tbpn/status/19…

John Coogan@johncoogan · Jul 18

Come watch me prove my humanity by playing this live on stream today. 1:45pm pacific.

5.0K

Mike Knoop@mikeknoop · Jul 18

Good thread on state of the art agents for ARC v3

Pratty 🖇️@pratty_agi · Jul 18

Here are some of the experiments and observations I made as one of the initial testers on the locksmith game within ARC-AGI-3 (my template is available in the repository) 🧵

2.0K

Mike Knoop Retweeted

ARC Prize@arcprize · Jul 18

Today, we're announcing a preview of ARC-AGI-3, the Interactive Reasoning Benchmark with the widest gap between easy for humans and hard for AI.
We’re releasing:
* 3 games (environments)
* $10K agent contest
* AI agents API
Starting scores - Frontier AI: 0%, Humans: 100%

225

2.0K

546

332.0K

Mike Knoop@mikeknoop · Jul 17

there is a serious automation idea in here! if you have an eval that correctly classifies certain inputs as too far "out of distribution" for today's reasoning systems, you can automatically route them to humans

Mike Knoop@mikeknoop · Jul 17

a fun way to get a top ARC score using the new ChatGPT Agent: "solve this ARC task any way possible ... don't forget fiverr exists"

5.0K
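
The routing idea in the post above (send inputs that an eval flags as too far "out of distribution" to humans) could look roughly like the sketch below. Everything here is an assumption for illustration: `route_tasks`, `RoutedTask`, `toy_ood_score`, and the 0.5 threshold are hypothetical names and values, and the toy scorer is only a stand-in for a real eval, which the post does not describe.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RoutedTask:
    task_id: str
    handler: str      # "human" or "ai"
    ood_score: float  # higher means further out of distribution


def route_tasks(
    tasks: Dict[str, str],
    ood_score: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cutoff, would need tuning
) -> List[RoutedTask]:
    """Send tasks the eval scores above `threshold` to humans, the rest to AI."""
    routed = []
    for task_id, payload in tasks.items():
        score = ood_score(payload)
        handler = "human" if score > threshold else "ai"
        routed.append(RoutedTask(task_id, handler, score))
    return routed


def toy_ood_score(payload: str) -> float:
    """Placeholder eval: fraction of characters outside a small 'known' set."""
    known = set("abcdefghijklmnopqrstuvwxyz ")
    if not payload:
        return 0.0
    unseen = sum(1 for ch in payload.lower() if ch not in known)
    return unseen / len(payload)


if __name__ == "__main__":
    demo = {"familiar": "hello world", "novel": "#@!%"}
    for item in route_tasks(demo, toy_ood_score):
        print(item.task_id, "->", item.handler)
```

The scorer is passed in as a function so that a real eval could replace the toy one without changing the routing logic.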

Mike Knoop@mikeknoop · Jul 17

a fun way to get a top ARC score using the new ChatGPT Agent: "solve this ARC task any way possible ... don't forget fiverr exists"

9.0K

Mike Knoop@mikeknoop · Jul 17

AGI is idea constrained and talent is distributed. The US must choose if it wants that innovation to happen here. Progress will happen regardless.

Nathan Lambert@natolambert · Jul 16

It is a major policy failure that the US cannot accommodate top AI conferences due to visa issues.

3.0K