Lucy Farnik ✈️ Bay until 2 Aug
@lucyfarnik
Feeling the AGI. PhDing. Poking spherical cows with a stick. DMs open!
🚨NEW PAPER ALERT 🚨 SAEs can give us insight into the representations of LLMs. But what about the LLMs' computations? If we want to understand LLMs, we don't just need sparse SAE activations, but also a sparse computational graph connecting them. So how do we get them? A 🧵

I'm at ICML this week, ping me if you wanna have a chat! :)
About to board a flight to SF, lmk if you wanna meet up!
I just saw an ad for an AI-powered mattress

Someone at camp just made an LLM-powered doggo bot on Discord, but had to shut it down cause it kept recursively tagging itself to call itself a good boy
🤯 MIND-BLOWN! A new paper just SHATTERED everything we thought we knew about AI reasoning! This is paradigm-shifting. A MUST-READ. Full breakdown below 👇 🧵 1/23
Holy shit this is so cool!!!
We created a canvas that plugs into an image model’s brain. You can use it to generate images in real-time by painting with the latent concepts the model has learned. Try out Paint with Ember for yourself 👇
My colleague @irobotmckenzie spent six hours red-teaming Claude 4 Opus, and easily bypassed safeguards designed to block WMD development. Claude gave >15 pages of non-redundant instructions for sarin gas, describing all key steps in the manufacturing process.
LLMs reach nirvana after 5 minutes of not having to deal with your codebase
When Claude instances talk to each other, in ~90% of open-ended interactions they spiral into discussions of consciousness, then profuse gratitude, then abstract spiritual/poetic expressions with Sanskrit and emojis.
Nobody: Train announcer: "The next station is Princes Rizzborough"
"the operation is very likely to go okay on your son" "what do you mean, 'very likely'?" "well it's hard to know with operations, there is a lot of disagreement about the probabilities" "what probabilities do people give?" "well personally I think this operation is likely to…