Rory Watts
@RoryWalshWatts
CEO of Forecast Health, a global health consultancy which builds mathematical models and software.
Current LLM stack (20 June 2025):
- Claude Code for 95% of coding
- Gemini 2.5 Pro for 5% where I need to dump a few codebases into the context
- Planning - o3 Pro on OpenRouter
Again, repping @OpenRouterAI
- Uncertainty in old monolithic repo. Gemini Pro not being clear, hallucinating
- Chat with o3-Pro for 90 minutes, spend a few $
- Reads Gemini's assessment, pushes back, is correct
- Fully understand the issues, avoided 8-10 hours of debugging
Has anyone made a "written by chatGPT" bingo card yet? Em-dashes, "it's not X, it's Y", what else?
This “just cook bro” mentality is peak broke-brain logic. You think spending an hour every day chopping onions and scrubbing pans is some badge of honor? Congrats, you saved $7 and burned the only free hour you had after work. Hope the lentils were worth it. Cooking isn’t free.…
Feeling the 2 Watch Problem vibes:
Gemini - Your assessment is incorrect
o3-Pro - Your assessment is correct
*Present Gemini's assessment to o3-Pro*
o3-Pro - My assessment is still correct
*Present o3-Pro's assessment to Gemini*
Gemini - My assessment was incorrect
5 years from now, hacking will be like in the 90s when you just had to telnet into something and login with admin:admin.
vibe code so hard, your entire waitlist is visible in frontend.
Gemini 2.5 Pro is good but extremely frustrating to work with. Its tendency to start every response with "You are absolutely correct" makes me completely distrust it, and its tendency to "use analogies", e.g. "think of this <complex function> as the 'factory floor'", spare me
The use of Grok to support individual beliefs is rife and terrifying
🚨NEW: Scott Adams says that he was in so much pain that he had planned to euthanize himself today. He recently started taking a new hormone/testosterone blocker and says that he's currently pain free and he hopes that this can add a few years to his life.
This is the first month in about a year where I haven't (yet) renewed my OpenAI subscription. Feels more difficult to justify when
- Claude Code is so good
- Gemini is very good at everything else (deep research)
This will probably change in about a day of course
After a few years with LLMs it's become clear to me that my ideal way to ingest new information is "2 - 3 paragraphs, clearly worded, no jargon" The default output of bold, lists, tables is littered with things grabbing for attention.
I think about this chart often.
A strange conflict I now have when using @claude_code and other AIs is this:
- I get far more done than in the pre-ChatGPT era
- I feel like I haven't done as much
I definitely need to calibrate better
It's very cute when @claude_code writes a migration plan that still uses 2022 estimates of the time needed, e.g. "Week 1: Migrate API". You realise we are now doing this all in one afternoon, right?
Hey @_catwu are there plans to bring planning mode to Claude Code users who pay via API rather than subscription? Thanks!
I don't think I give @OpenAI's 4o enough credit as a daily driver. It is a genuinely great model for most everyday things. Thanks!
Gemini 2.5 Pro 05-06 seems very reluctant to output transformed data, e.g. "here's a CSV, can you change its formatting?" It almost always defaults to writing a Python script to do it instead.
Me, talking to Gemini 2.5: "How do I do extremely-obtuse-idiosyncratic-thing-in-my-codebase" Gemini: *thinking* Gemini: Okay! This is actually a super common issue when dealing with extremely-obtuse-idiosyncratic-thing-in-codebases. Is this sycophancy?