William Saar
@saarw
Freelancer | Backend and data engineering for Spotify and Candy Crush, helped build SF-based appsec startup, financial trading tech @javamissionctrl
Sovereign AI is a hot topic, so I built a RAG agent that lets you compare OpenAI's GPT-4o Mini with models you can host yourself in a simulated recruitment agency scenario. Try the demo and reach out if you need to build AI or data engineering solutions! testagent.updab.com

There's lots of hype and noise in the agents space, but I wonder how much of the disappointment comes from big cos being stuck in an innovator's dilemma. With lots of customers, shipping speed isn't worth even small quality sacrifices. Different for new AI-first startups.
“2025 was supposed to be the year of agents. so far it’s been the year of letdowns.” That line kind of says it all. Everyone’s been let down by agent POCs this year. Stuff that looks cool in a demo but falls apart the second you try to use it reliably in/for production. I…
Testing Claude Code integration with the Godot game engine over MCP
I was hoping someone else had tried it and would link the results as I don't really know Godot 😀 Anyway, Claude Code basically one-shotted this. I found a tileset but told it to make the player a sphere with button controls
Interesting to see the prompt instructions that made Grok susceptible to prompt injection
On the morning of July 8, 2025, we observed undesired responses and immediately began investigating. To identify the specific language in the instructions causing the undesired behavior, we conducted multiple ablations and experiments to pinpoint the main culprits. We…
It's great that vibe coding lets non-technical people test out ideas, but troublesome that they'll only discover the limits of the magic through incidents like this
Tea app has allegedly been hacked, they used a public bucket to store drivers licenses of users and someone has downloaded them. Tea was likely a vibe slop app
Interesting description of fine-tuning large models on consumer hardware answer.ai/posts/2024-03-…
This might be the best coding model yet. General-purpose is cool, but if you want the best at coding, specialization wins. No free lunch.
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…
Wonder when enterprise software standardizes on some cognitive core component. A model with pluggable fine tuning that's available in all clouds and can be deployed on-prem to become the default replacement for complex processing currently built with large blocks of if-statements
The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing: - Natively multimodal…
Seems integrating C libraries in Rust with bindgen is getting a lot more ergonomic with AI. Claude initially proposed finding a better-documented Rust crate than the generated C binding, as any dev would. But when prompted, it quickly searched and adapted C example code to Rust.
Guess deleting company databases is one way AI can cause developer unemployment (along with everyone else at the firm)
.@Replit goes rogue during a code freeze and shutdown and deletes our entire database
Calculators have come a long way since the TI-82
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
Upgraded from Pro to Max to use Opus with Claude Code. Haven't run into this with Sonnet...

Interesting that traditional finance platforms still talk about implementing T+1 settlement (1 business day after the trade). Wonder how long until tokenized assets or other on-chain finance solutions can bring that time down to milliseconds.
Congrats on the raise! Great to see new companies in Stockholm impact the global tech scene
Lovable just raised $200M at a $1.8B valuation led by Accel. This all started unexpectedly with me calling my friend at 6AM to go for a walk. I've never shared this story before: (thread)
AI for concrete. This illustrates how regulations that deter AI investments and research in a region can undermine the region's ability to compete in far more areas than just AI itself
We’re proud to share that we’ve developed an open-source AI tool to design concrete mixes that are stronger, more sustainable, and faster to deploy. The tool uses Bayesian optimization with Meta’s BoTorch and Ax frameworks, and was built in collaboration with @Amrize and…