Qinyuan Ye

@qinyuan_ye

👩‍💻 Ph.D. student @nlp_usc | 🐾 Teaching machines to be versatile and curious.

Los Angeles, CA

Joined September 2019

1K Following

2K Followers

Qinyuan Ye Retweeted

Johnny Tian-Zheng Wei @johntzwei · Jul 10

Are you a researcher, trying to build a small GPU cluster? Did you already build one, and it sucks? I manage USC NLP’s GPU cluster and I’m happy to offer my expertise. I hope I can save you some headaches and make some friends. Please reach out!

Qinyuan Ye Retweeted

Mayee Chen @MayeeChen · Jun 24

LLMs often generate correct answers but struggle to select them. Weaver tackles this by combining many weak verifiers (reward models, LM judges) into a stronger signal using statistical tools from Weak Supervision—matching o3-mini-level accuracy with much cheaper models! 📊

Qinyuan Ye Retweeted

Harvey Yiyun Fu @harveyiyun · Jun 18

LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing? 🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative space” in documents. paper:…

Qinyuan Ye Retweeted

Brihi Joshi @BrihiJ · Apr 29

Reasoning about the "why" behind user behavior can improve LLM personas! ✨🧠📈 📝Excited to share our new work: Improving LLM Personas via Rationalization with Psychological Scaffolds 🔗 arxiv.org/abs/2504.17993 🧵 (1/n)

Qinyuan Ye Retweeted

Tianyi Lorena Yan @LorenaYannnnn · Mar 26

When answering queries with multiple answers (e.g., listing cities of a country), how do LMs simultaneously recall knowledge and avoid repeating themselves? 🚀 Excited to share our latest work with @robinomial! We uncover a promote-then-suppress mechanism: LMs first recall all…

Qinyuan Ye Retweeted

Johnny Tian-Zheng Wei @johntzwei · Feb 25

Many works addressing copyright for LLMs focus on model outputs and their similarity to copyrighted training data, but few focus on how the model was trained. We analyze LLM memorization w.r.t. their training decisions and theorize on its use in court arxiv.org/abs/2502.16290

Qinyuan Ye Retweeted

Huihan Li @huihan_li · Jan 22

Accepted @iclr_conf 🤩 We build a pretraining corpus attribution framework that determines whether an entity is associated with a culture through memorization or other driving factors, and analyze whether such associations are related to pretraining data distribution. #ICLR2025

Qinyuan Ye Retweeted

Wenting Zhao @wzhao_nlp · Jan 6

📣 Announcing VerifAI: AI Verification in the Wild, a workshop at #ICLR2025. VerifAI will gather researchers to explore topics at the intersection of genAI/trustworthyML and verification: verifai-workshop.github.io @celine_ylee @theo_olausson @ameeshsh @wellecks @taoyds
