Zheng Zhao@ACL2025
@zhengzhao97
PhD Student @Edin_CDT_NLP @edinburghnlp | intern @AIatMeta | former intern @amazon | working on LLMs
I’ll be sharing two posters at #ACL2025NLP! 🎉 🗓 Mon (28 Jul): VISTA: A Video-to-Text Summarization Dataset for Scientific Talks 🗓 Wed (30 Jul): Discourse-Driven Planning for Explanatory Summarization 📍 ALL at Hall 4/5 Come by to chat about #Summarization, and beyond! 🤖🧑‍🔬
The amazing folks at @EdinburghNLP will be presenting a few papers at ACL 2025 (@aclmeeting); if you're in Vienna, touch base with them! Here are the papers in the main track 🧵
Heading to ACL @aclmeeting. I will: 1. Present my paper with @Apple, “Eliciting In-context Retrieval and Reasoning for Long-Context Language Models”, 18:00–19:30, Monday 28th July, Hall 4/5. 2. Be at the @AIatMeta booth, 1–2PM, Tuesday 29th July. Chat if you’re around.
🔁 What if you could bootstrap a world model (state1 × action → state2) using a much easier-to-train dynamics model (state1 × state2 → action) in a generalist VLM? 💡 We show how a dynamics model can generate synthetic trajectories & serve for inference-time verification 🧵👇
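A minimal sketch of the bootstrapping loop described above, assuming hypothetical inverse_dynamics (state1 × state2 → action) and world_model (state1 × action → state2) callables; the names, toy state/action types, and the 0/1 consistency score are my own illustration, not the paper's actual setup.

```python
# Sketch only: an easier-to-train inverse dynamics model (s1, s2 -> a) can
# (1) label observed state pairs to create synthetic trajectories for training
# a forward world model (s1, a -> s2), and (2) verify the world model's
# predictions at inference time. All names are hypothetical.

from typing import Callable, List, Tuple

State, Action = str, str  # toy types: e.g. frame captions / action strings for a VLM

def synthesize_trajectories(
    state_pairs: List[Tuple[State, State]],
    inverse_dynamics: Callable[[State, State], Action],
) -> List[Tuple[State, Action, State]]:
    """Label observed (s1, s2) pairs with inferred actions to get (s1, a, s2)
    triples that can train a forward world model."""
    return [(s1, inverse_dynamics(s1, s2), s2) for s1, s2 in state_pairs]

def verified_rollout(
    s1: State,
    candidate_actions: List[Action],
    world_model: Callable[[State, Action], State],
    inverse_dynamics: Callable[[State, State], Action],
) -> Tuple[Action, State]:
    """Inference-time verification: keep the candidate whose predicted
    transition the inverse dynamics model agrees with most."""
    def consistency(a: Action, s2: State) -> float:
        return float(inverse_dynamics(s1, s2) == a)  # toy 0/1 agreement score
    scored = [(a, world_model(s1, a)) for a in candidate_actions]
    return max(scored, key=lambda pair: consistency(*pair))
```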
Next week I'll be in Vienna to present our work about the interplay between SES and LLM usage 🇦🇹 I will present it on Wednesday morning and also take part in a panel at the end of the session. If you're interested in the topic, feel free to reach out 🤗 #ACL2025NLP #NLProc
📄The AI Gap: How Socioeconomic Status Affects Language Technology Interactions 👥 @EliBassignana * @CurriedAmanda * Dirk Hovy 🔗 arxiv.org/pdf/2505.12158 🎯We call for inclusive NLP technologies to accommodate different SES and mitigate the digital divide.
Transformer vs. Mixture of Experts in LLMs, clearly explained (with visuals):
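Since the visuals don't carry over here, a minimal PyTorch-style sketch of the contrast: a dense FFN applies the same MLP to every token, while an MoE layer routes each token to its top-k experts so only a fraction of the parameters is active per token. Dimensions and the router are illustrative defaults, not from any specific model.

```python
import torch
import torch.nn as nn

class DenseFFN(nn.Module):
    """Standard Transformer FFN: every token goes through the same MLP."""
    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

    def forward(self, x):  # x: (batch, seq, d_model)
        return self.net(x)

class MoEFFN(nn.Module):
    """Mixture-of-experts FFN: a learned router sends each token to its
    top-k experts, so only those experts' parameters are used per token."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList([DenseFFN(d_model, d_ff) for _ in range(n_experts)])
        self.router = nn.Linear(d_model, n_experts)
        self.k = k

    def forward(self, x):  # x: (batch, seq, d_model)
        gates = self.router(x).softmax(dim=-1)        # routing probabilities per token
        topv, topi = gates.topk(self.k, dim=-1)       # top-k experts per token
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[..., slot] == e           # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += topv[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out
```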
🚀 Introducing Prefix-RFT to blend SFT and RFT! SFT can learn complex problems by mimicking demonstrations, but it can generalize poorly. RFT achieves better overall performance but is limited by the initial policy. Our method, Prefix-RFT, gets the best of both worlds!
🚨 Long Paper Accepted at @aclmeeting 2025 main conference! 🚨 🎥 Our work "What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations" introduces VISTA, a large-scale benchmark for scientific video summarization. #ACL2025 #NLProc #LLMs 🧵(1/3)
Can multimodal LLMs truly understand research poster images?📊 🚀 We introduce PosterSum—a new multimodal benchmark for scientific poster summarization! 🪧 📂 Dataset: huggingface.co/datasets/rohit… 📜 Paper: arxiv.org/abs/2502.17540
🚀 Happy to share my Apple internship work from last year! A promising use case for long-context LLMs is fitting an entire knowledge base in the prompt as contextual knowledge for tasks like QA, rather than using a RAG pipeline. But are they up to this? And if not, how can we improve them?
🎉 Excited to share “Generalizing from Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning” 📄 (arxiv.org/pdf/2502.15592) We propose "context synthesis": instead of generating instructions from long texts, we synthesize contexts for instructions—drawing…
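A rough sketch of how I read the "context synthesis" idea: start from an existing short (instruction, answer) pair and have an LLM write the long context it should be answerable from, then pad with distractor documents to reach long-context lengths. The generate callable, the prompt, and the padding scheme are placeholders, not the paper's actual pipeline.

```python
# Hypothetical sketch of "context synthesis" for long-context instruction tuning.
# The direction is reversed w.r.t. the usual recipe: instead of
#   long document -> synthesize an instruction,
# we take an existing (instruction, answer) pair and synthesize the long
# context it should be answerable from. `generate` stands in for any LLM call.

from typing import Callable, Dict, List

def synthesize_long_context_example(
    instruction: str,
    answer: str,
    generate: Callable[[str], str],   # placeholder LLM call
    distractor_docs: List[str],       # unrelated texts used to pad the context
) -> Dict[str, str]:
    prompt = (
        "Write a detailed background document such that the question below "
        f"can be answered from it.\nQuestion: {instruction}\nAnswer: {answer}\n"
    )
    relevant_doc = generate(prompt)
    # Interleave the synthesized document with distractors to reach long-context length.
    context = "\n\n".join(distractor_docs[:2] + [relevant_doc] + distractor_docs[2:])
    return {"context": context, "instruction": instruction, "answer": answer}
```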
'Are We Done with MMLU?' Despite its widespread use, our analysis of the MMLU benchmark uncovers significant ground truth errors that obscure LLM performance. For instance, 57% of questions in the Virology subset of MMLU contain errors. To address this, we propose a novel error…
🚨What’s the worst answer to preventing Ebola outbreaks? 🤔 "Arrange rapid military intervention into West Africa with EU & USA army teams." 🤦‍♂️ That’s from the original MMLU dataset! Check out MMLU-Redux led by @aryopg #NAACL2025
I will present the 🌊 Spectral Editing of Activations paper with @zhengzhao97 tomorrow at #NeurIPS2024. Stop by and say 👋 📍 East Hall A-C #3200 Poster ⏰ 11AM–2PM Looking forward to chatting more about LLMs’ hallucination, fairness, and grounding in the physical world.
[1/n] Can we align LLMs with gradient-free methods? We propose Spectral Editing of Activations (SEA). SEA edits activations during inference with projections found via a closed-form solution. With SEA, we improve the truthfulness and fairness of LLMs. arxiv.org/abs/2405.09719
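A minimal numpy sketch in the spirit of SEA, assuming we have hidden states collected on neutral prompts and on positive/negative demonstrations: the editing projections come from an SVD in closed form (no gradients) and are applied to hidden states at inference. The exact objective and decomposition in the paper may differ (see arxiv.org/abs/2405.09719).

```python
# Illustrative, gradient-free sketch: closed-form spectral projections from
# cross-covariance between activations, applied to hidden states at inference.

import numpy as np

def spectral_projector(h_neutral: np.ndarray, h_signal: np.ndarray, k: int) -> np.ndarray:
    """SVD of the cross-covariance between neutral and signal activations
    (both n x d) gives a rank-k projector onto the directions that covary
    most with the signal behaviour."""
    cov = h_neutral.T @ h_signal / len(h_neutral)   # (d, d) cross-covariance
    u, _, _ = np.linalg.svd(cov)
    basis = u[:, :k]                                # top-k spectral directions
    return basis @ basis.T                          # (d, d) projector

def sea_edit(h: np.ndarray, pos_proj: np.ndarray, neg_proj: np.ndarray) -> np.ndarray:
    """Edit a single hidden state h (d,) at inference: keep the 'positive'
    subspace and project out the 'negative' one."""
    d = h.shape[0]
    h = pos_proj @ h                                # emphasise desirable directions
    return (np.eye(d) - neg_proj) @ h               # suppress undesirable ones

# Usage sketch (h_* are stacked hidden states from paired demonstrations):
# pos = spectral_projector(h_neutral, h_truthful, k=64)
# neg = spectral_projector(h_neutral, h_hallucinated, k=64)
# edited = sea_edit(hidden_state, pos, neg)
```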