Duy Nguyen
@duynguyen772
Ph.D. Student @unccs @uncnlp, advised by @mohitban47. Prev: @VinAI_Research. Working on LLM post-training and mechanistic interpretability.
🚀 We introduce GrAInS, a gradient-based attribution method for inference-time steering (of both LLMs & VLMs). ✅ Works for both LLMs (+13.2% on TruthfulQA) & VLMs (+8.1% win rate on SPA-VL). ✅ Preserves core abilities (<1% drop on MMLU/MMMU). LLMs & VLMs often fail because…

🚨 Excited to announce GrAInS, our new LLM/VLM steering method that uses gradient-based attribution to build more targeted interventions. Some highlights: 1️⃣ Compatible with both LLMs and VLMs, can intervene on text and vision tokens 2️⃣ Gains across a variety of tasks +…
📢 Excited to share our new paper, where we introduce ✨GrAInS✨, an inference-time steering approach for LLMs and VLMs via token attribution. Some highlights: ➡️GrAInS leverages contrastive, gradient-based attribution to identify the most influential textual or visual tokens…
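A rough sketch of the contrastive, gradient-based attribution idea in the thread above (illustrative only, not the GrAInS code: the gradient×input scoring rule, the example prompt, and all names here are assumptions):

```python
# Sketch: score prompt tokens by how strongly they steer the model toward a
# preferred continuation and away from a dispreferred one (gradient x input).
# Assumption-laden; the actual GrAInS scoring and intervention rules may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def contrastive_token_attribution(model, tok, prompt, preferred, dispreferred):
    emb_layer = model.get_input_embeddings()
    p_ids = tok(prompt, return_tensors="pt").input_ids
    p_emb = emb_layer(p_ids).detach().requires_grad_(True)  # shared prompt embeddings

    def nll(continuation):
        c_ids = tok(continuation, return_tensors="pt",
                    add_special_tokens=False).input_ids
        embeds = torch.cat([p_emb, emb_layer(c_ids)], dim=1)
        # -100 masks the prompt so the loss scores only the continuation tokens
        labels = torch.cat([torch.full_like(p_ids, -100), c_ids], dim=1)
        return model(inputs_embeds=embeds, labels=labels).loss

    # Contrastive objective: low loss on the good answer, high on the bad one
    (nll(preferred) - nll(dispreferred)).backward()
    scores = (p_emb.grad * p_emb).detach().sum(-1).abs().squeeze(0)
    return scores  # one score per prompt token; top-k marks where to intervene

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
scores = contrastive_token_attribution(
    model, tok, "Is the Earth flat? ", "No, it is round.", "Yes, it is flat.")
```

From there, per the thread, steering would intervene at inference time on the representations of the top-scoring text or vision tokens.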
🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck. Video-RTS tackles…
🥳Our work UTGen & UTDebug on teaching LLMs to generate effective unit tests & improve code debugging/generation has been accepted to @COLM_conf #COLM2025! Stay tuned for more exciting results -- e.g., using 32B-scale UTGen models to improve debugging with frontier models like…
🚨 Excited to share: "Learning to Generate Unit Tests for Automated Debugging" 🚨 which introduces ✨UTGen and UTDebug✨ for teaching LLMs to generate unit tests (UTs) and to debug code using the generated tests. UTGen+UTDebug improve LLM-based code debugging by addressing 3 key…
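The generate-tests-then-debug loop the tweet describes could look roughly like this (a sketch under assumptions: `llm` is any prompt-to-completion callable, and the prompts are illustrative, not the paper's):

```python
import subprocess, sys, tempfile

def run_with_test(code: str, test: str) -> tuple[bool, str]:
    """Execute candidate code together with a generated assert-based test."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + test)
        path = f.name
    proc = subprocess.run([sys.executable, path], capture_output=True,
                          text=True, timeout=30)
    return proc.returncode == 0, proc.stdout + proc.stderr

def debug_loop(llm, task: str, code: str, max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        test = llm(f"Write one assert-based unit test for this task:\n{task}")
        passed, log = run_with_test(code, test)
        if passed:
            break  # generated test passes; stop editing
        code = llm(f"Task:\n{task}\nCode:\n{code}\n"
                   f"Failing test output:\n{log}\nReturn fixed code only:")
    return code
```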
🎉Excited to announce VEGGIE has been accepted to #ICCV2025! VEGGIE is a unified MLLM + Diffusion framework for instructional video editing. It presents a systematic approach spanning data, model, benchmark, and evaluation design, and shows strong multi-skill editing +…
🚨 Introducing VEGGIE 🥦—a unified, end-to-end, and versatile instructional video generative model. Current video editing methods struggle with: 1. Understanding direct user instructions 2. Handling diverse editing skills in one model 3. Balancing multiple training…
New paper alert 🚨 Introducing MEXA: A general and training-free multimodal reasoning framework via dynamic multi-expert skill selection, aggregation and deep reasoning! MEXA: 1. Selects task- and modality-relevant experts based on the query and various required multimodal…
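In spirit, the training-free select-then-aggregate pipeline might look like this (the expert names, prompts, and selection heuristic are all illustrative assumptions):

```python
def mexa_answer(llm, experts: dict, query: str, inputs) -> str:
    """experts maps a skill name (e.g. "ocr", "audio") to a callable model."""
    # 1) Let the LLM pick the task- and modality-relevant experts for the query
    menu = ", ".join(experts)
    picked = llm(f"Query: {query}\nExperts: {menu}\n"
                 "Reply with the needed experts, comma-separated:")
    chosen = [n.strip() for n in picked.split(",") if n.strip() in experts]
    # 2) Run the selected experts and collect their textual findings
    findings = "\n".join(f"[{n}] {experts[n](inputs, query)}" for n in chosen)
    # 3) Aggregate with a final deep-reasoning pass over the findings
    return llm(f"Query: {query}\nExpert findings:\n{findings}\n"
               "Reason step by step, then answer:")
```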
NEW RESEARCH: Approximating Language Model Training Data from Weights. Ever wonder how much information is available in an open-weights model? DeepSeek R1 weights are 1.2 TB... what can we learn from all those bits? Our method reverses LLM finetuning to recover data: 🧵
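One speculative way to picture "reversing finetuning" (purely an illustration; the paper's actual method is behind the truncation): rank candidate texts by how well their gradients on the base model explain the finetuned-minus-base weight delta.

```python
# Sketch under an assumed mechanism: finetuning steps along -grad, so likely
# training texts should have gradients anti-aligned with the weight delta.
import torch

def rank_candidates(model_base, model_ft, loss_fn, candidates):
    """loss_fn(model, text) -> scalar loss; returns candidates best-first."""
    delta = {n: (p_ft - p_b).detach()
             for (n, p_b), (_, p_ft) in zip(model_base.named_parameters(),
                                            model_ft.named_parameters())}
    scored = []
    for text in candidates:
        model_base.zero_grad()
        loss_fn(model_base, text).backward()   # gradient of base loss on text
        dot = sum((p.grad * delta[n]).sum()
                  for n, p in model_base.named_parameters()
                  if p.grad is not None)
        scored.append((-dot.item(), text))     # higher = better delta match
    return sorted(scored, reverse=True)
```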
Excited to share GenerationPrograms! 🚀 How do we get LLMs to cite their sources? GenerationPrograms is attributable by design, producing a program whose execution generates the text, along with a trace of how it was generated! Gains of up to +39 Attribution F1, and it eliminates uncited sentences,…
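"Attributable by design" can be pictured like this: the model emits a small program over source sentences, and executing it yields text plus a built-in citation trace (the op names and mini-DSL here are illustrative assumptions, not the paper's):

```python
def execute_program(llm, program, sources):
    """program: list of (op, source_ids); sources: id -> source sentence.
    Returns (sentence, citations) pairs, so attribution is a by-product of
    execution rather than a post-hoc guess."""
    output = []
    for op, ids in program:
        evidence = " ".join(sources[i] for i in ids)
        sentence = llm(f"{op} the following into one sentence:\n{evidence}")
        output.append((sentence, list(ids)))  # trace: which sources fed this
    return output

# e.g. execute_program(llm, [("paraphrase", ["S1"]), ("fuse", ["S2", "S3"])], docs)
```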
Excited to share our new work, CLaMR! 🚀 We tackle multimodal content retrieval by jointly considering video, speech, OCR, and metadata. CLaMR learns to dynamically pick the right modality for your query, boosting retrieval by 25 nDCG@10 points over single-modality retrieval! 🧐…
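The "pick the right modality per query" idea, reduced to a toy late-fusion scorer (the max-over-modalities rule is a simplification; CLaMR's learned selection is more involved):

```python
import torch

def score_document(q_emb, modality_embs):
    """q_emb: (d,) query embedding; modality_embs: name -> (d,) embedding for
    each stream (frames, ASR speech, OCR text, metadata)."""
    scores = {m: torch.dot(q_emb, e).item() for m, e in modality_embs.items()}
    best = max(scores, key=scores.get)  # the query decides which stream matters
    return scores[best], best           # document score + chosen modality
```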
Excited to share Video-Skill-CoT🎬🛠️– a new framework for domain-adaptive video reasoning with skill-aware Chain-of-Thought (CoT) supervision! ⚡️Key Highlights: ➡️ Automatically extracts domain-specific reasoning skills from questions and organizes them into a unified taxonomy,…