Qinan Yu
@qinan_yu
CS PhD @stanfordnlp · CS-Math Ugrad @Brown_NLP
Ever wonder why LLMs give inconsistent answers in different languages? In our paper, we identify two failure points in the multilingual factual recall process and propose fixes that guide LLMs to the "right path." This can boost performance by 35% in the weakest language!
How do multilingual LLMs encode structural similarities across languages? We find that LLMs use identical circuits when languages share the same morphosyntactic processes. However, they involve specialized components to handle tasks in languages with specific linguistic features.
Circuit analysis is a common tool in mechanistic interpretability for understanding how models execute certain tasks. But how well do these findings generalize throughout model training or to models of different sizes?
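For readers unfamiliar with the method, here is a minimal sketch of the basic circuit-analysis operation, activation patching, on GPT-2. The layer choice and prompts are placeholders for illustration, not taken from the paper.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
LAYER = 5  # placeholder: which block's MLP output to patch

clean = tok("The Eiffel Tower is in the city of", return_tensors="pt")
corrupt = tok("The Colosseum is in the city of", return_tensors="pt")
paris = tok(" Paris")["input_ids"][0]

cache = {}
def save(module, inp, out):
    # cache the MLP output at the final position on the corrupted run
    cache["mlp"] = out[:, -1].detach()

def patch(module, inp, out):
    # overwrite the same position on the clean run with the cached activation
    out = out.clone()
    out[:, -1] = cache["mlp"]
    return out

mlp = model.transformer.h[LAYER].mlp
with torch.no_grad():
    handle = mlp.register_forward_hook(save)
    model(**corrupt)
    handle.remove()

    baseline = model(**clean).logits[0, -1, paris].item()

    handle = mlp.register_forward_hook(patch)
    patched = model(**clean).logits[0, -1, paris].item()
    handle.remove()

# how far patching this one component moves the model toward the corrupted answer
print(f"' Paris' logit: clean {baseline:.2f} -> patched {patched:.2f}")
```

Components whose patched activations move the output a lot are the candidates for the task's circuit; the generalization question above is whether the same components keep that role across training checkpoints and model sizes.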
Excited to share our #ICML2024 paper "Grokking Group Multiplication with Cosets" with @BlancheMinerva, @qinan_yu and @Void13950782! We reverse engineered neural networks that perfectly learned to multiply elements of the symmetric groups S5 & S6. 🧵 on our key findings below
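For context, a small sketch of the underlying task, not the paper's reverse-engineered algorithm: multiplying permutations in S5, where parity gives the simplest coset structure, the two cosets of the alternating group A5.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def parity(p):
    """0 for even permutations (the A5 coset), 1 for odd (the other coset)."""
    inversions = sum(1 for i in range(len(p))
                       for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2

S5 = list(permutations(range(5)))   # all 120 elements of S5
a, b = S5[7], S5[42]                # two arbitrary elements
print(compose(a, b))                # their product in S5
# parity respects multiplication, so coset membership of a product is predictable
print(parity(compose(a, b)) == (parity(a) + parity(b)) % 2)  # True
```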
Accepted at EMNLP: LLMs often have to integrate information in context with facts learned during pretraining. Sometimes these facts disagree, so how do they handle this competition? We find that we can modulate single attention heads to control which version the model uses!
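As a rough illustration of that kind of intervention (the layer and head indices below are made-up placeholders, and this is not the paper's exact setup), one can scale a single attention head's output in GPT-2 with a forward pre-hook:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

LAYER, HEAD, SCALE = 9, 6, 0.0   # placeholders: which head to modulate, and by how much

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
head_dim = model.config.n_embd // model.config.n_head

def scale_head(module, args):
    # c_proj's input is the concatenation of all head outputs: (batch, seq, n_embd).
    # Scaling one slice up- or down-weights that single head's contribution.
    hidden = args[0].clone()
    hidden[..., HEAD * head_dim:(HEAD + 1) * head_dim] *= SCALE
    return (hidden,) + args[1:]

handle = model.transformer.h[LAYER].attn.c_proj.register_forward_pre_hook(scale_head)
with torch.no_grad():
    out = model(**tok("The capital of France is", return_tensors="pt"))
handle.remove()
print(tok.decode([out.logits[0, -1].argmax().item()]))
```

With SCALE = 0.0 the head is ablated; values above 1.0 amplify it, which is the kind of knob that can tilt the model toward the in-context fact or the pretrained one.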
Are Chain-of-Thought reasoning chains good "explanations"? Not necessarily, since they aren't always faithful -- and we propose a 2-stage reasoning framework to solve this. See our paper "Faithful Chain-of-Thought Reasoning" at the #NLRSE workshop (1:30pm Thur) at #ACL2023!
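The gist of the two-stage idea, in a toy sketch (the hard-coded "translation" stands in for the LLM call, so everything here is illustrative rather than the paper's code): the model first writes its reasoning chain as an executable program, and a deterministic interpreter then derives the answer from that chain, so the stated reasoning is exactly what produced the answer.

```python
def translate_to_program(question: str) -> str:
    """Stage 1 (hypothetical): prompt an LLM to emit executable reasoning steps."""
    # In practice this is an LLM call; hard-coded here for illustration.
    return (
        "# Olivia has 23 dollars and buys 5 bagels at 3 dollars each.\n"
        "money = 23\n"
        "spent = 5 * 3\n"
        "answer = money - spent\n"
    )

def execute_chain(program: str):
    """Stage 2: run the chain with a deterministic solver (here, Python's exec)."""
    scope = {}
    exec(program, scope)
    return scope["answer"]

print(execute_chain(translate_to_program("How much money does Olivia have left?")))  # 8
```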