Myra Deng
@myra_deng
aligning models @goodfireAI, prev @stanford and @twosigma
Has anyone used any good AI shopping assistants? I’ve tried @shopondaydream and deep research but couldn’t get either to work for me. Maybe this is a sign to stop buying clothes :/
I wrote some fiction in the style of AI 2027. It combines the parts of AI 2027 and AI as Normal Technology that resonate with me most. Come for the predictions, stay for the animal parables!
Just wrote a piece on why I believe interpretability is AI’s most important frontier - we're building the most powerful technology in history, but still can't reliably engineer or understand our models. With rapidly improving model capabilities, interpretability is more urgent,…
(1/7) New research: how can we understand how an AI model actually works? Our method, SPD, decomposes the *parameters* of neural networks, rather than their activations - akin to understanding a program by reverse-engineering the source code vs. inspecting runtime behavior.
A few months ago, we published Attribution-based parameter decomposition -- a method for decomposing a network's parameters for interpretability. But it was janky and didn't scale. Today, we published a new, better algorithm called 🔶Stochastic Parameter Decomposition!🔶
New research update! We replicated @AnthropicAI's circuit tracing methods to test if they can recover a known, simple transformer mechanism.
"[Deep] unsupervised learning looked worse until suddenly it looked better. We think interpretability is likely to follow a similar arc... Each new item on the tech tree unlocks new questions, new ways to look at the problem. We're building toward the breakthrough"
the latent space is vast (so much larger than a text box!) and this lets you literally paint with it. i haven't felt quite this way since i saw melody interpolation with musicVAE. as good a time as any to announce i'm joining goodfire! i'm really excited about what we're working on
We created a canvas that plugs into an image model’s brain. You can use it to generate images in real-time by painting with the latent concepts the model has learned. Try out Paint with Ember for yourself 👇
painting > prompting. excited for this to be public soon! more to share throughout the week