Ashwinee Panda
@PandaAshwinee
Postdoc of @tomgoldsteincs, PhD @princeton, @Cal alum, currently working on LLMs
thrilled to receive the outstanding paper award for our work on shallow alignment! i’ll be giving the talk at 10:42am tomorrow (Thursday) in oral session 1D. the poster will be Friday 3PM.
Outstanding Papers:
- Safety Alignment Should be Made More Than Just a Few Tokens Deep. Xiangyu Qi, et al.
- Learning Dynamics of LLM Finetuning. Yi Ren and Danica J. Sutherland.
- AlphaEdit: Null-Space Constrained Model Editing for Language Models. Junfeng Fang, et al.
master stroke by @kellerjordan0 and co. to not name their adam-killer anything that could be confused with a human name, thus avoiding this issue entirely
Anyone know Adam?
i once failed an interview bc instead of giving the standard answer to “why do LLMs need tokenization” i decided to say what i really think, which is “i don’t think they really need it…” very glad to see this arch that albert has been hyping up, and excited to try it out
Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence. Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.
Norman’s team is working on some seriously hard problems, but they have a ton of resources and a lot of really smart people to work with. This is a super exciting team to join for sure!
I'm hiring for our AI safety team at xAI! We urgently need strong engineers/researchers to work across all stages of the frontier AI development cycle: data, training, evals, and product
1. job-boards.greenhouse.io/xai/jobs/47992…
2. job-boards.greenhouse.io/xai/jobs/47992…
well well well look how the turntables
DO NOT DO THIS. I have previously raised this for Ethics Review when I saw it in a paper. You are not sneaky.
i have a new SOTA algorithm for generating kernels:
1. claim that your proposed algorithm beats torch.compile
2. wait for horace to give you the magic incantation that will make torch.compile generate a better kernel
Pretty cool, especially for long sequences! I will note that you can pretty easily get much better numbers for torch.compile that are much closer for sequences up to about 16384. A couple things:
1. By default torch.compile generates dynamic-shapes kernels when benchmarked…
what a fascinating study! it does seem like most people have moved on from the "i work with Cursor" to "i tell Claude Code to do something and then swap to another tab", i wonder how that would impact things.
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
i really like this blog because reading it feels like having a conversation with albert; specifically, statements like "I’m driven by aesthetics much more than the average person, I’d guess". i'm excited to see this new architecture that i've been hearing so much about!
I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit) In a few days, we'll release what I believe is the next major advance for architectures.
we’ll be presenting LoRI at #COLM2025!
🚨 How much parameter redundancy does LoRA really contain? We introduce LoRI, a method that keeps performance strong—even when we drastically shrink trainable parameters of LoRA. 🧵1/N
"someone of Ilya Sutskever's capabilities" i've got just the guy...
We asked @kyliebytes (Senior Correspondent @WIRED) about Meta's hiring strategy and the challenges of attracting top talent. "I'll be impressed if they get someone of Ilya Sutskever's capabilities." "The people that want to build super-intelligence are true believers." "The…
> progress is based on real-world experiments rather than raw intelligence
the tech that people are cooking up now is based on the insights from deploying models. if GPT-5 can't actually deploy its creations, how is it going to figure out what is needed for GPT-6? evals?
We don’t have AI self-improvement yet, and when we do it will be a game-changer. With more wisdom now compared to the GPT-4 days, it's obvious that it will not be a “fast takeoff”, but rather extremely gradual across many years, probably a decade. The first thing to know is that…