Qin Liu
@QinLiu_NLP
PhD student @UC_Davis | MS & BA @FudanUni | AI safety and Trustworthy LLMs
🎉 Excited to share our ACL 2025 paper: 🤖R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory 🧠 📄 Paper: arxiv.org/abs/2501.12485 📍Poster: Hall 4/5, Session 4 Wednesday, July 30 11:00-12:30 🧵👇
@ReviewAcl @emnlpmeeting Urgent help needed. acFZ: initial score 3 🧊 Complete silence during discussion. ⏰ 4am PST, 9 min before deadline: quietly drops to 2. with “Thanks for the rebuttal. I have updated the score.” ⚠️ No explanation. No notice. No chance to respond. (0/n)
🔍 Introducing QA-LIGN: A reflective alignment approach using a draft→reflection→revision pipeline. We create symbolic reward models that serve as both natural language critics & general reward models, bridging rule-based rewards and RLAIF. 📄 Paper: arxiv.org/pdf/2506.08123
😴 Extending modality based on an LLM has been a common practice when we are talking about multimodal LLMs. ❓ Can it generalize to omni-modality? We study the effects of extending modality and ask three questions: arxiv.org/abs/2506.01872 #LLM #MLLM #OmniModality
Can LLM guardrails think twice before deciding? ✨ Check out our #ACL2025 paper: THINKGUARD — a critique-augmented safety guardrail! ✅ Structured critiques ✅ Interpretable decisions ✅ Robust against adversarial prompts 📑 arxiv.org/abs/2502.13458 🧵[1/n]
🧵1/ Excited to share our #NAACL2025 work! 🎉 "Assessing LLMs for Zero-Shot Abstractive Summarization Through the Lens of Relevance Paraphrasing" We study how robust LLM summarization is to our relevance paraphrasing method? 🧠📝 More details below:👇 arxiv.org/abs/2406.03993
Worried about backdoors in LLMs? 🌟 Check out our #NAACL2025 work on test-time backdoor mitigation! ✅ Black-box 📦 ✅ Plug-and-play 🛡️ We explore: → Defensive Demonstrations 🧪 → Self-generated Prefixes 🧩 → Self-refinement ✍️ 📄 arxiv.org/abs/2311.09763 🧵[1/n]
🎉 Excited to share that our paper, "MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding", will be presented at #ICLR2025! 📅 Date: April 24 🕒 Time: 3:00 PM 📍 Location: Hall 3 + Hall 2B #11 MuirBench challenges multimodal LLMs with diverse multi-image…
🚨 Call for Papers! @aclmeeting 🚨 LLM Security Workshop @ ACL 2025 (the first workshop of ACL SIGSEC) 🔐 Topics: Adversarial attacks, defenses, vulnerabilities, ethical & legal aspects, safe deployment of LLMs and more 📅 Submission Deadline: April 15, 2025 📍 August 1, 2025 in…
🚀 Excited to share MetaScale, our latest work advancing LLM reasoning capabilities! MetaScale empowers GPT-4o to match or even surpass frontier reasoning models like o1, Claude-3.5-Sonnet, and o1-mini on the challenging Arena-Hard benchmark (@lmarena_ai). Additionally, MetaScale…
🚀 Introducing 𝗦𝗲𝗮𝗿𝗰𝗵-𝗥𝟭 – the first 𝗿𝗲𝗽𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻 𝗼𝗳 𝗗𝗲𝗲𝗽𝘀𝗲𝗲𝗸-𝗥𝟭 (𝘇𝗲𝗿𝗼) for training reasoning and search-augmented LLM agents with reinforcement learning! This is a step towards training an 𝗼𝗽𝗲𝗻-𝘀𝗼𝘂𝗿𝗰𝗲 𝗢𝗽𝗲𝗻𝗔𝗜 “𝗗𝗲𝗲𝗽…
🌟 Check out our latest comprehensive survey on: 🌟 ⚠️Emergent backdoor threats to LLMs 👻Safety challenges to LLMs 💡Future research directions in this area Invited paper at 60th Annual Allerton Conference: ieeexplore.ieee.org/abstract/docum…


Excited to present our #EMNLP2024 paper “Securing Multi-turn Conversational Language Model’s from Distributed Backdoor Triggers”, a new threat framework to chat models (like ChatGPT) that is compatible with other backdoor triggers to complexify defense.