Lindia Tjuatja @ ACL 2025
@lltjuatja
a natural language processor and “sensible linguist”. PhD-ing @LTIatCMU + interning @apple, previously BS-ing @UT_linguistics + @utexasece 🤠🤖📖 she/her
When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs: 🧵1/9

to the person that made the macrodata refinement theme on vscode thank you, you made my day, my refining will be extra productive with this color scheme
Introducing Disentangled Safety Adapters (DSAs) for fast and flexible AI safety. To block harmful responses from an LLM, a separate LLM called a "safety guardrail" is often used to judge their safety. However, to get high-quality safety predictions, we need to use reasonably…
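The guardrail pattern this tweet describes can be sketched in a few lines: a separate judge scores each candidate response before it is released. This is a toy illustration only — `toy_guardrail` and its keyword blocklist are hypothetical stand-ins, not the DSA method or any real safety classifier.

```python
# Toy sketch of the "safety guardrail" pattern: a separate judge
# decides whether an LLM response may be shown to the user.
# The blocklist heuristic below is purely illustrative.

BLOCKLIST = {"steal the keys", "build a bomb"}

def toy_guardrail(response: str) -> bool:
    """Return True if the response is judged safe (toy heuristic)."""
    lowered = response.lower()
    return not any(phrase in lowered for phrase in BLOCKLIST)

def guarded_reply(response: str) -> str:
    # Gate the LLM's output on the guardrail's verdict.
    return response if toy_guardrail(response) else "[response blocked]"

print(guarded_reply("Here is a cookie recipe."))
print(guarded_reply("First, steal the keys from the desk."))
```

In a real system the judge is itself a (smaller or adapted) model, which is exactly the cost the tweet is pointing at: high-quality safety predictions usually require a reasonably capable judge.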
committed to doing my part in decreasing reviewer workload by writing fewer papers
📢 today's scaling laws often don't work for predicting downstream task performance. For some pretraining setups, smooth and predictable scaling is the exception, not the rule. a quick read about scaling law fails: 📜arxiv.org/abs/2507.00885 🧵1/5👇
Nice work from @lltjuatja and @gneubig on using SAEs to describe fine-grained differences between the outputs of different language models. SAEs are valuable if you know where to use them!
Where does one language model outperform the other? We examine this from first principles, performing unsupervised discovery of "abilities" that one model has and the other does not. Results show interesting differences between model classes, sizes, and pre-/post-training.
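The comparison underlying the question "where does one LM outperform another?" can be illustrated with per-token surprisal differences: tokens where one model assigns much higher probability than the other mark where it "wins". A minimal sketch with made-up probability tables (the token probabilities below are assumed for illustration, not from the paper, and real models would supply them):

```python
import math

# Hypothetical per-token probabilities from two toy LMs "A" and "B".
tokens = ["the", "cat", "sat"]
probs_a = {"the": 0.5, "cat": 0.1, "sat": 0.2}
probs_b = {"the": 0.4, "cat": 0.3, "sat": 0.1}

def surprisal(p: float) -> float:
    """Surprisal in bits: lower means the model predicted the token better."""
    return -math.log2(p)

# Positive difference => model B predicts the token better than model A.
diffs = {t: surprisal(probs_a[t]) - surprisal(probs_b[t]) for t in tokens}
for t, d in diffs.items():
    better = "B" if d > 0 else "A"
    print(f"{t}: model {better} better by {abs(d):.2f} bits")
```

Clustering or otherwise summarizing such per-token wins over a large corpus is one natural route to fine-grained model comparison; the paper's actual method (the tweet mentions SAEs) goes further than this sketch.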