Jonathan Chang
@ChangJonathanC
ML/AI Engineer, building http://jonathanc.net
I wrote a blog post about building the MIST robot from Pantheon. I identified limitations of the current AIs in 2025 and proposed a benchmark: how well can an AI build a robot with help from a human? Link below.

i think multi agent is best done with multiple different models. the labs will use their own models, but open source has the freedom to combine the strength of different models
TIL: press and hold cmd to show command panel in chatgpt

to make sure your instruction goes directly into claude code's system prompt, do this:
mv ~/.claude/CLAUDE.md ~/.claude/CLAUDE_SYSTEM.md
claude --append-system-prompt "$(cat ~/.claude/CLAUDE_SYSTEM.md)"
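a slightly more careful sketch of the same move, in case you run it more than once. the function name move_memory_file is made up for illustration and is not part of claude code; only mv, cat, and the --append-system-prompt flag come from the post above.

```shell
# Sketch only: move_memory_file is a hypothetical helper, not a Claude Code command.
# It moves the file aside once and is safe to re-run: it does nothing when the
# source is missing or the destination already exists.
move_memory_file() {
  mm_src="$1"
  mm_dst="$2"
  if [ -f "$mm_src" ] && [ ! -e "$mm_dst" ]; then
    mv "$mm_src" "$mm_dst"
  fi
  # Succeed only if the destination is readable afterwards.
  [ -r "$mm_dst" ]
}

# Usage (needs the claude CLI installed, so it is left commented out):
# move_memory_file "$HOME/.claude/CLAUDE.md" "$HOME/.claude/CLAUDE_SYSTEM.md" &&
#   claude --append-system-prompt "$(cat "$HOME/.claude/CLAUDE_SYSTEM.md")"
```

the guard matters because a plain mv fails on the second run, once CLAUDE.md is already gone.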
This is how CLAUDE.md is passed into context, so Claude just doesn't think your uv instructions are relevant: "...IMPORTANT: this context may or may not be relevant to your tasks. You should not respond to this context or otherwise consider it in your response unless it is…
Super simple - make a @Cloudflare gateway at developers.cloudflare.com/ai-gateway/get…
Get the account id and gateway id
ANTHROPIC_BASE_URL=gateway.ai.cloudflare.com/v1/<account-id>/<gateway-id>/anthropic claude
Look at charts and drink wine
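a minimal sketch of assembling that base URL from the two ids. gateway_base_url is a hypothetical helper name, and the https:// scheme is an assumption on my part; the path layout is the one from the post.

```shell
# Sketch only: gateway_base_url is a hypothetical helper name.
# Assumption: the gateway is reached over https://.
# $1 = Cloudflare account id, $2 = gateway id
gateway_base_url() {
  printf 'https://gateway.ai.cloudflare.com/v1/%s/%s/anthropic\n' "$1" "$2"
}

# Usage (placeholders; needs the claude CLI installed, so left commented out):
# ANTHROPIC_BASE_URL="$(gateway_base_url "<account-id>" "<gateway-id>")" claude
```

setting the variable inline like this scopes it to that one claude invocation instead of exporting it for the whole shell session.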
Did you have to write a workers service to get the data you want? Or is this a cloudflare gateway feature or something?
MIST bench is not yet saturated. results from chatgpt agent, 2 tries:
chatgpt agent test: "go to <link> and make a diagram explaining transformer architecture". the agent worked for 9-10 mins and returned these:


made a smol eval today :) can you guess what kind of task?

chatgpt agent seems to automatically insert a user message if you forget to respond to the clarifying question

for future reference, these are the suggested use cases for agents as of July 25, 2025
ChatGPT agent is now fully rolled out to all Plus, Pro, and Team users. Sorry about the delay!