Rohit Gandikota
@rohitgandikota
Ph.D. in AI @ Northeastern University. Understanding, mapping, and editing knowledge in large generative models. Ex-scientist, Indian Space Research Organization
Can you ask a Diffusion Model to break down a concept? 👀 SliderSpace 🚀 reveals maps of the visual knowledge naturally encoded within diffusion models. It works by decomposing the model's capabilities into intuitive, composable sliders. Here's how 🧵👇
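The "decompose capabilities into composable sliders" idea can be caricatured in a few lines. This is an illustrative toy, not the actual SliderSpace method: assume we already have a batch of output embeddings and use plain PCA, so each principal direction of variation acts as a slider you can scale and recombine.

```python
import numpy as np

def discover_sliders(embeddings, n_sliders=3):
    """Toy 'slider discovery': PCA over a batch of output embeddings,
    so each principal direction becomes a composable slider."""
    mean = embeddings.mean(axis=0)
    # right singular vectors = principal directions of variation
    _, _, vt = np.linalg.svd(embeddings - mean, full_matrices=False)
    return mean, vt[:n_sliders]

def apply_slider(mean, sliders, weights):
    """Move along a weighted combination of discovered directions."""
    return mean + np.asarray(weights) @ sliders
```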
RAG for Wikipedia! Seems like a good way to streamline evaluations when editing knowledge in LLMs
RAGs are extremely useful, and yet there isn't an open-source RAG system for Wikipedia (or I couldn't find one). So I built WikiRAG, a simple open-source GitHub + Hugging Face repo✨ Spin up your own RAG server for Wikipedia in a single line. 🚗💨
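For a sense of what a Wikipedia RAG retriever does under the hood, here is a toy bag-of-words version. Purely illustrative, not the WikiRAG API: real systems use dense embeddings and a vector index, but the retrieve-then-rank skeleton is the same.

```python
import math
from collections import Counter

def tokenize(text):
    # crude normalization: lowercase, strip trailing punctuation
    return [w.strip(".,!?").lower() for w in text.split()]

def retrieve(query, passages, k=1):
    """Score each passage by TF-IDF overlap with the query
    and return the top-k passages."""
    n = len(passages)
    docs = [Counter(tokenize(p)) for p in passages]
    # document frequency: how many passages contain each word
    df = Counter(w for d in docs for w in d)
    def score(d):
        return sum(d[w] * math.log(n / df[w])
                   for w in tokenize(query) if w in d)
    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
    return [passages[i] for i in ranked[:k]]
```

A RAG server then just stuffs the retrieved passages into the LLM prompt as context.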
Yes indeed! Here are 3 pieces of advice from @davidbau for PhD students interested in mech interp
@davidbau's talk on "What is AI interpretability for?" is about to start! If you know David or have seen his talks in the past, you know it is going to be special🔥 Fan or skeptic of mech interp? Come ask David some questions! Location: Grand C1 Hall or Zoom
The Mechanistic Interpretability for Vision Workshop has officially begun at @CVPR! 🚀 Join us at Grand C1 Hall for insightful perspectives on the state of interpretability in vision models by @TamarRottShaham.
Come join us at the @miv_cvpr2025 workshop today at @CVPR in room C1! We have an amazing lineup of speakers 🔊 including @davidbau @soniajoseph_ @trevordarrell @aleks_madry Antonio Torralba and Michal Irani 🙌🏻
🔍 Curious about what's really happening inside vision models? Join us at the First Workshop on Mechanistic Interpretability for Vision (MIV) at @CVPR! 📢 Website: sites.google.com/view/miv-cvpr2… Meet our amazing invited speakers! #CVPR2025 #MIV25 #MechInterp #ComputerVision
Vision transformers have high-norm outliers that hurt performance and distort attention. While prior work removed them by retraining with “register” tokens, we find the mechanism behind outliers and make registers at ✨test-time✨—giving clean features and better performance! 🧵
Artifacts in your attention maps? Forgot to train with registers? Use 𝙩𝙚𝙨𝙩-𝙩𝙞𝙢𝙚 𝙧𝙚𝙜𝙞𝙨𝙩𝙚𝙧𝙨! We find that a sparse set of activations sets the artifact positions. We can shift them anywhere ("Shifted") — even outside the image into an untrained token. Clean maps, no retraining.
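The shift trick can be sketched in miniature. This is an illustrative toy, not the paper's code: assume activations arrive as a (tokens, dim) array, outliers are flagged by norm against the median, and their mass is moved into one appended "register" token.

```python
import numpy as np

def shift_outliers_to_register(tokens, threshold=5.0):
    """Append a blank register token and move high-norm outlier
    activations into it, zeroing the original patch positions.
    tokens: (n_tokens, dim) array of ViT activations."""
    norms = np.linalg.norm(tokens, axis=1)
    # flag tokens whose norm dwarfs the typical token norm
    outliers = norms > threshold * np.median(norms)
    register = tokens[outliers].sum(axis=0)  # collect outlier mass
    cleaned = tokens.copy()
    cleaned[outliers] = 0.0                  # clean patch positions
    return np.vstack([cleaned, register[None, :]]), outliers
```

Attention maps computed over the cleaned patch tokens then no longer show the high-norm spikes.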
Tired of looking at pixels and want to look at some neurons? Come join us @miv_cvpr2025 this Thursday @CVPR in room "Grand C1"
It might be a rainy @CVPR this time, but we at MIV workshop have you covered! Come to Grand C1 hall and listen to our great speakers talk about why mechanistic interpretability is important for vision models Date: June 12th, 9AM Location: Grand C1 hall More info?👇
Super excited about @miv_cvpr2025! What an incredible lineup of invited speakers! 👀 You instantly know that the workshop is going to be 🔥 Come to Grand C1 hall on June 12th
Is the knowledge of a concept **really** removed when we erase a concept from a diffusion model? @kevinlu4588 found that the answer is often NO! Check out these clever yet simple techniques to search for the traces of knowledge in your erased models.
When we "erase" a concept from a diffusion model, is that knowledge truly gone? 🤔 We investigated, and the answer is often 'no'! Using simple probing techniques, the knowledge traces of the erased concept can be easily resurfaced 🔍 Here is what we learned 🧵👇
1. Enhance a concept. You can enhance a concept inside a model's knowledge. For example, here is UCE enhancing "mustache" inside @vivago_ai's HiDream-I1 model in under 10 secs on a 40GB GPU
I used to think formal reasoning was central to language and intelligence, but now I’m not so sure. Wrote a short post about my thoughts on this, with a couple chewy anecdotes. Would love to get some feedback/pointers to further reading. sfeucht.github.io/syllogisms/
@amuuueller presenting sparse feature circuits at #ICLR2025 Also Aaron is starting as faculty at BU in the fall so reach out to him if you’re looking for PhD positions. You’ll get a terrific mentor!
Nice trick to make diffusion model outputs more diverse...
Why do distilled diffusion models generate similar-looking images? 🤔 Our Diffusion Target (DT) visualization reveals the secret to diversity. It is the very first time-step! And—there is a simple, training-free way to make them more diverse! Here is how: 🧵👇
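The training-free idea can be sketched roughly like this. Hypothetical names, not the paper's implementation: assume `denoise_fn` is a one-step distilled model, and the trick is to keep a fraction of the random seed noise in the very first step's output so different seeds stay distinguishable.

```python
import numpy as np

def diversify_first_step(denoise_fn, x_T, gamma=0.5):
    """Training-free diversity sketch: at the very first timestep,
    blend the model's prediction back toward the initial noise.
    denoise_fn(x, t) -> predicted clean latent; gamma in [0, 1]."""
    x0_pred = denoise_fn(x_T, t=1.0)
    # gamma=1 reproduces the (possibly collapsed) model output;
    # smaller gamma reinjects seed-dependent variation
    return gamma * x0_pred + (1.0 - gamma) * x_T
```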
An exciting and up-to-date implementation for text-to-image model editing of our TIME method! Great to see these ideas evolve and get applied to SOTA models. Project page & paper 👇🏻
You can now edit the HiDream-I1 model in under 10 seconds!🚀 (... FLUX and SDXL in under 2 seconds on an RTX4090) We are releasing a simple implementation of our UCE work to support any diffusion model. Here are some cool things you can do with this lightning-fast editing method 🧵👇
AI systems are sold as black boxes, so you might think it is impossible to understand their thoughts. But you can. And control their knowledge. Like fixing all sorts of unfair biases—heh, even showing that scientists don't actually wear white coats. x.com/rohitgandikota…
3. Erasing a visual association. In addition to erasing a concept entirely, you can also edit the association between two concepts. For example, here is UCE erasing the knowledge from @StabilityAI's SDXL model that "scientists wear glasses" in <2 secs on a 24GB GPU
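The closed-form flavor of edits like this can be sketched as a least-squares weight update. This is a sketch in the spirit of UCE, not its exact implementation: edit keys get mapped to new target values while preserved keys keep their old outputs, all in one matrix solve — which is why it runs in seconds.

```python
import numpy as np

def closed_form_edit(W, edit_keys, edit_values, preserve_keys, lam=1.0):
    """One-shot weight edit: find W_new minimizing
    sum ||W_new c_i - v_i||^2 + lam * sum ||W_new c_p - W c_p||^2.
    W: (d_out, d_in); keys: (n, d_in); values: (n, d_out)."""
    num = edit_values.T @ edit_keys + lam * W @ (preserve_keys.T @ preserve_keys)
    den = edit_keys.T @ edit_keys + lam * (preserve_keys.T @ preserve_keys)
    return num @ np.linalg.inv(den)
```

No gradient descent, no data loader — just a linear solve per edited projection matrix, which is what makes this style of editing so fast.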
Realize what @rohitgandikota and @OrgadHadas and @materzynska have done here. It goes beyond making an AI that can edit something for you. By understanding how the neurons store knowledge, they let you reach inside and change that knowledge directly. x.com/rohitgandikota…
Those are image editing tools. Here, we are editing the model weights themselves. It's like training your own custom model. Now, a common problem with creating custom models is that they take a lot of time and require huge GPU compute. Our method is super fast and requires little compute.