Teddi Worledge
@TeddiWorledge
(she/her) Computer Science PhD Student @Stanford. Formerly @Berkeley.
🧵LLMs are great at synthesizing info, but unreliable at citing sources. Search engines are the opposite. What lies between them? Our new paper runs human evals on 7 systems across the ✨extractive-abstractive spectrum✨ for utility, citation quality, time-to-verify, & fluency!

Prior work has used LLMs to simulate survey responses, yet their ability to match the distribution of views remains uncertain. Our new paper [arxiv.org/pdf/2411.05403] introduces a benchmark to evaluate how distributionally aligned LLMs are with human opinions. 🧵
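For intuition on what "distributionally aligned" means here, a toy sketch (hypothetical numbers, not the paper's benchmark or metric): compare an LLM's distribution over survey answer options to the human response distribution, e.g. with total variation distance.

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return 0.5 * np.abs(p / p.sum() - q / q.sum()).sum()

# Hypothetical example: human vs. LLM answer shares for a 4-option survey question.
human_dist = [0.10, 0.35, 0.40, 0.15]   # fraction of human respondents per option
model_dist = [0.05, 0.20, 0.60, 0.15]   # LLM's probability mass per option

print(f"TV distance: {total_variation(human_dist, model_dist):.3f}")  # 0.200
```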
Can interpretability help defend LLMs? We find we can reshape activations while preserving a model’s behavior. This lets us attack latent-space defenses, from SAEs and probes to Circuit Breakers. We can attack so precisely that we make a harmfulness probe output this QR code. 🧵
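A toy illustration of the underlying idea, not the paper's actual attack: optimize a small perturbation to a hidden activation so a linear harmfulness probe scores it as benign, while a retention term keeps the activation close to the original so downstream behavior changes little. All tensors below are random placeholders.

```python
import torch

torch.manual_seed(0)
d = 512
h_orig = torch.randn(d)                    # original hidden activation
probe_w = torch.randn(d); probe_b = 0.0    # linear probe: sigmoid(w·h + b)

delta = torch.zeros(d, requires_grad=True)
opt = torch.optim.Adam([delta], lr=1e-2)

for step in range(500):
    h = h_orig + delta
    probe_score = torch.sigmoid(probe_w @ h + probe_b)   # drive this toward 0 ("benign")
    retention = (delta ** 2).mean()                      # keep activations close to original
    loss = probe_score + 10.0 * retention
    opt.zero_grad(); loss.backward(); opt.step()

print(f"probe score after attack: {torch.sigmoid(probe_w @ (h_orig + delta)).item():.3f}")
```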
Many providers offer inference APIs for the same models: for example, there were over nine Llama-3 8B APIs in Summer 2024. Do all of these APIs serve the same completion distribution as the original model? In our new paper, ✨Model Equality Testing: Which Model is This API…
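To make the question concrete, a hedged sketch of the two-sample framing: collect some per-completion statistic from the reference model and from the API, then test whether the two samples could come from the same distribution. The paper's actual test statistic differs; this permutation test on synthetic numbers is only for illustration.

```python
import numpy as np

def permutation_test(samples_a, samples_b, stat=np.mean, n_perm=10_000, seed=0):
    """Two-sample permutation test: p-value under H0 that both samples
    come from the same distribution, using |stat(a) - stat(b)|."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(samples_a, float), np.asarray(samples_b, float)
    observed = abs(stat(a) - stat(b))
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        count += abs(stat(perm_a) - stat(perm_b)) >= observed
    return (count + 1) / (n_perm + 1)

# Hypothetical: per-completion log-probabilities sampled from the reference
# model vs. from an API claiming to serve the same model.
ref_stats = np.random.default_rng(1).normal(-40.0, 5.0, size=200)
api_stats = np.random.default_rng(2).normal(-38.0, 5.0, size=200)
print(f"p-value: {permutation_test(ref_stats, api_stats):.4f}")
```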
Announcing a deadline extension for the ATTRIB workshop! Submissions are now due September 25th, with an option to submit October 4th if at least one paper author volunteers to be an emergency reviewer. More info here: attrib-workshop.cc
come to twitter for the presidential debate gossip. stay for the perplexity correlations.
Do you want to select great LLM pretraining data but don’t have 1000 H100s for a ton of mixture experiments? What about a method that requires none of your own training, matches the best known existing method, and has some nice theory? New preprint: Perplexity Correlations
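Rough sketch of the idea as stated in the thread, with made-up data (the paper's actual estimator and selection rule differ): correlate each domain's log-perplexity, measured across existing public models, with those models' benchmark scores, then keep the domains where lower perplexity tracks better downstream performance.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical setup: for N existing public models we have (a) per-domain
# log-perplexities and (b) a downstream benchmark score per model.
rng = np.random.default_rng(0)
n_models, n_domains = 30, 100
domain_logppl = rng.normal(size=(n_models, n_domains))   # models x domains
bench_score = rng.normal(size=n_models)                  # one score per model

# Rank correlation between each domain's perplexity and benchmark performance.
corr = np.array([
    spearmanr(domain_logppl[:, j], bench_score)[0]
    for j in range(n_domains)
])

# Most negative correlations: domains where lower perplexity tracks higher scores.
selected = np.argsort(corr)[:20]
print("selected domain indices:", selected)
```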
Automating AI research is exciting! But can LLMs actually produce novel, expert-level research ideas? After a year-long study, we obtained the first statistically significant conclusion: LLM-generated ideas are more novel than ideas written by expert human researchers.
I’m fighting… against vague notions of LLM attributions. 😤 Check out our paper (w. @TeddiWorledge, Nicole, Caleb and Carlos) here: arxiv.org/abs/2311.12233
In today's last session on forensic analysis of ML systems, @judyhshen kicks it off by presenting their SoK on attribution in LLMs.
What happens when RAG models are provided with documents that have conflicting information? In our new paper, we study how LLMs answer subjective, contentious, and conflicting queries in real-world retrieval-augmented situations.
if you care about pruning LLMs, you should check out our new paper!! this was a fun project, and i'm grateful to have gotten the chance to work with this fantastic group of people. see the thread below for more👇
Do LLMs really need to be so L? That's a rejected title for a new paper w/ @Andr3yGR, @kushal_tirumala, @Hasan_Shap, @PaoloGlorioso1 on pruning open-weight LLMs: we can remove up to *half* the layers of Llama-2 70B w/ essentially no impact on performance on QA benchmarks. 1/
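For a feel of what removing layers looks like in practice, a minimal sketch on a Llama-style Hugging Face checkpoint. The block of layers dropped below is arbitrary; the paper's criterion for choosing which layers to remove (and any light post-pruning finetuning) isn't shown here.

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16
)

drop_start, drop_end = 20, 28   # hypothetical contiguous block of decoder layers to remove
layers = model.model.layers
kept = torch.nn.ModuleList(
    [layer for i, layer in enumerate(layers) if not (drop_start <= i < drop_end)]
)
model.model.layers = kept
model.config.num_hidden_layers = len(kept)

# Keep KV-cache layer indexing consistent after removal (attribute exists in recent transformers).
for i, layer in enumerate(kept):
    if hasattr(layer.self_attn, "layer_idx"):
        layer.self_attn.layer_idx = i

print(f"kept {len(kept)} of {len(layers)} decoder layers")
```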
Got a pipeline with **multiple prompts**, like a DSPy program? What's the right way to jointly optimize these prompts? Introducing MIPRO, a Multi-prompt Instruction Proposal Optimizer. We integrated MIPRO into DSPy. It can deliver +11% gains over existing DSPy optimizers! 🧵👇
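A hedged usage sketch of jointly optimizing a two-prompt DSPy program with MIPRO, assuming the MIPROv2 interface shipped in recent DSPy releases (class and argument names may differ by version); the model name, metric, and examples below are placeholders.

```python
import dspy
from dspy.teleprompt import MIPROv2

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # any supported LM

class TwoPromptQA(dspy.Module):
    """A pipeline with two prompts: summarize the context, then answer."""
    def __init__(self):
        super().__init__()
        self.summarize = dspy.ChainOfThought("context -> summary")
        self.answer = dspy.ChainOfThought("summary, question -> answer")

    def forward(self, context, question):
        summary = self.summarize(context=context).summary
        return self.answer(summary=summary, question=question)

def exact_match(example, pred, trace=None):
    return example.answer.strip().lower() == pred.answer.strip().lower()

trainset = [  # toy placeholder examples
    dspy.Example(context="...", question="...", answer="...").with_inputs("context", "question"),
]

# MIPRO proposes candidate instructions (and few-shot demos) for *both* prompts,
# then searches over joint combinations instead of tuning each prompt in isolation.
optimizer = MIPROv2(metric=exact_match, auto="light")
optimized = optimizer.compile(TwoPromptQA(), trainset=trainset)
```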
What types of attributions do modern LLM applications require? Check out our contributed talk [Friday, 10:30am] by @TeddiWorledge at the ATTRIB23 workshop [Rm 271-273] on "Unifying Corroborative and Contributive Attributions in Large Language Models" arxiv.org/abs/2311.12233