Shawn Im
@shawnim00
PhD Student @UWMadison | NSF GRFP Fellow | Intern @Apple MLR | Prev @MIT http://shawn-im.github.io
Excited to share that I have received the NSF GRFP!!😀 I'm really grateful to my advisor @SharonYixuanLi for all her support, to @YilunZhou and @jacobandreas, and to everyone else who has guided me through my research journey! #nsfgrfp
Sean @xuefeng_du successfully defended his PhD thesis on “Foundations of Unknown-Aware Machine Learning” today. His PhD work laid the theoretical and algorithmic groundwork for building AI systems that can recognize and reason about the unknown—shaping the field in significant…
✨ My lab will be presenting a series of papers on LLM reliability and safety at #ICML2025—covering topics like hallucination detection, distribution shifts, alignment, and dataset contamination. If you’re attending ICML, please check them out! My students @HyeonggyuC @shawnim00…
Many existing works on advancing multimodal LLMs try to inject MORE information into the model. But is that the only, or even the right, way to improve the generalization and robustness of MLLMs?
🌍 GeoArena is live! Evaluate how well large vision-language models (LVLMs) understand the world through image geolocalization. Help us compare models via human preference — your feedback matters! 🔗 Try it now: huggingface.co/spaces/garena2… #GeoArena #Geolocation #LVLM #AI
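(Not spelled out in the post, but for context: arena-style leaderboards typically aggregate pairwise human votes with an Elo-style update. A minimal sketch with hypothetical model names and K-factor; GeoArena's actual scoring may differ.)

```python
# Elo-style update from one human preference vote between two LVLMs.
# Model names and K-factor are illustrative, not GeoArena's implementation.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_wins: bool, k: float = 32.0):
    """Return updated ratings after one vote."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_wins else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1.0 - s_a) - (1.0 - e_a))

ratings = {"lvlm-a": 1000.0, "lvlm-b": 1000.0}  # hypothetical models
ratings["lvlm-a"], ratings["lvlm-b"] = elo_update(
    ratings["lvlm-a"], ratings["lvlm-b"], a_wins=True
)
```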
🚨 We’re hiring! The Radio Lab @ NTU Singapore is looking for PhD students, master's students, undergrads, RAs, and interns to build responsible AI & LLMs. Remote/onsite from 2025. Interested? Email us: [email protected] 🔗 d12306.github.io/recru.html Please spread the word if you can!
🚨 If you care about reliable, low-cost LLM hallucination detection, our #ICML2025 paper offers a powerful and data-efficient solution. 💡We introduce TSV: Truthfulness Separator Vector — a single vector injected into a frozen LLM that reshapes its hidden space to better…
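(A minimal sketch of the injection mechanism only, assuming a HuggingFace-style decoder and an illustrative layer index; how the TSV itself is trained is in the paper.)

```python
import torch

# Sketch: a single learned vector added to a frozen LLM's hidden states at
# one layer via a forward hook. Layer choice and training of `tsv` are
# assumptions here; see the paper for the actual recipe.

def make_injection_hook(tsv: torch.Tensor):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + tsv  # broadcasts over batch and sequence dims
        if isinstance(output, tuple):
            return (hidden,) + tuple(output[1:])
        return hidden
    return hook

# Usage (layer 15 is illustrative):
# handle = model.model.layers[15].register_forward_hook(make_injection_hook(tsv))
```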
Sparse MLPs/dictionaries learn interpretable features in LLMs, yet provide poor layer reconstruction. Mixture of Decoders (MxDs) expand dense layers into sparsely activating sublayers instead, for a more faithful decomposition! 📝 arxiv.org/abs/2505.21364 [1/7]
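(A rough sketch of the "sparsely activating sublayers" idea from the tweet; dimensions, gating, and top-k are assumptions rather than the paper's exact parameterization.)

```python
import torch
import torch.nn as nn

# Sketch: a dense layer expanded into many small decoder sublayers, of
# which only k activate per token. Illustrative only; see
# arxiv.org/abs/2505.21364 for the actual MxD construction.

class MixtureOfDecoders(nn.Module):
    def __init__(self, d_in: int, d_out: int, n_experts: int = 512, k: int = 8):
        super().__init__()
        self.gate = nn.Linear(d_in, n_experts)
        self.decoders = nn.Parameter(torch.randn(n_experts, d_in, d_out) * 0.02)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.gate(x)                         # (batch, n_experts)
        topv, topi = scores.topk(self.k, dim=-1)      # sparse activations
        weights = torch.softmax(topv, dim=-1)         # (batch, k)
        experts = self.decoders[topi]                 # (batch, k, d_in, d_out)
        y = torch.einsum("bd,bkde->bke", x, experts)  # each sublayer decodes x
        return (weights.unsqueeze(-1) * y).sum(dim=1)  # (batch, d_out)
```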
Does anyone want to dig deeper into the robustness of Multimodal LLMs (MLLMs) beyond empirical observations? Happy to share that our new #ICML2025 paper does exactly that: "Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach"!
MetaMind: Equip LLM with a "Social Brain" via Metacognitive Multi-agent Framework “What is meant often goes far beyond what is said, and that is what makes conversation possible.” ——H. P. Grice Huggingface: huggingface.co/papers/2505.18…
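(Purely illustrative sketch of a metacognitive two-stage loop in this spirit; `llm` is a placeholder chat-completion callable, and the prompts and stage split are assumptions, not MetaMind's actual agents.)

```python
from typing import Callable

# Stage 1 hypothesizes what the speaker *means* beyond what is said;
# stage 2 responds conditioned on that hypothesis.

def metacognitive_reply(llm: Callable[[str], str], utterance: str) -> str:
    # Stage 1: infer the latent intent / mental state behind the utterance.
    intent = llm(
        "Infer the unstated intent, emotion, and goal behind this message:\n"
        f"{utterance}"
    )
    # Stage 2: reply to the inferred meaning, not just the literal words.
    return llm(
        f"Message: {utterance}\n"
        f"Inferred intent: {intent}\n"
        "Write a reply that addresses the intent, not just the literal words."
    )
```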
📢 Looking for new research ideas in AI alignment? Check out our new #ICML2025 position paper: "Challenges and Future Directions of Data-Centric AI Alignment". TL;DR: Aligning powerful AI systems isn't just about better algorithms — it's also about better feedback data, whether…
We have recently released HalluEntity (entity-level hallucination detection benchmark dataset) on HuggingFace: huggingface.co/datasets/samue…
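(To poke at the dataset; the dataset ID is truncated in the link above, so the ID below is a placeholder.)

```python
from datasets import load_dataset

# "<org>/HalluEntity" is a placeholder: copy the exact dataset ID from the
# HuggingFace page linked above.
ds = load_dataset("<org>/HalluEntity")
print(ds)  # available splits and features
```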
🚀 New Paper Alert! 🚀 📄 Can Your Uncertainty Scores Detect Hallucinated Entity? We explore entity-level hallucination detection and benchmark 5 uncertainty-based detection methods. Paper: arxiv.org/abs/2502.11948 (w/ @KamacheeMax, @seongheon_96, and @SharonYixuanLi) [1/N]
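(For flavor, one standard uncertainty baseline in this space is mean token-level predictive entropy over a generated entity span; a minimal sketch, noting the five methods benchmarked in the paper may differ.)

```python
import torch
import torch.nn.functional as F

# Score an entity span by the average predictive entropy of its tokens;
# high entropy flags the entity as a hallucination candidate.

def entity_entropy(logits: torch.Tensor, span: slice) -> float:
    """logits: (seq_len, vocab) next-token logits for the generated text."""
    probs = F.softmax(logits[span], dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)  # per token
    return entropy.mean().item()
```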
🚨 Paper Alert! 🚨 👀 Is your AI model cheating at test time? 🧮 We introduce Kernel Divergence Score (KDS), a reliable scoring method that quantifies dataset leakage in LLMs. Paper: arxiv.org/abs/2502.00678 (w/ @KhanovMax, @OwenWei8, and @SharonYixuanLi) [1/N]
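(A heavily simplified sketch of the kernel-comparison shape of such a score: embed benchmark samples before and after fine-tuning on that benchmark, build Gram matrices, and measure how much the kernel shifts. The actual KDS kernel and divergence are defined in the paper; the RBF kernel and Frobenius distance here are stand-ins.)

```python
import numpy as np

def rbf_kernel(X: np.ndarray, gamma: float = 1.0) -> np.ndarray:
    """Gram matrix of an RBF kernel over sample embeddings (n, d)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def kernel_shift(emb_before: np.ndarray, emb_after: np.ndarray) -> float:
    """Crude proxy: how much the embedding kernel moves under fine-tuning."""
    K1, K2 = rbf_kernel(emb_before), rbf_kernel(emb_after)
    return float(np.linalg.norm(K1 - K2, ord="fro"))
```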
How should we assign rewards to intermediate steps in reasoning? DeepSeek-R1 paper highlights it as an open challenge. Here’s everything you need to know about Process Reward Models—the progress, our latest breakthrough Process Q-value Model (PQM), and how it advances @OpenAI's…
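(The core contrast with outcome reward models, sketched minimally: each intermediate step gets its own score. `step_scorer` stands in for a trained PRM/PQM head; how PQM defines Q-values over steps is in the linked work, not reproduced here.)

```python
from typing import Callable, List

def score_steps(step_scorer: Callable[[str, List[str]], float],
                question: str, steps: List[str]) -> List[float]:
    """Assign a reward to every intermediate reasoning step."""
    rewards = []
    for i in range(len(steps)):
        # Score step i given the question and all preceding steps.
        rewards.append(step_scorer(question, steps[: i + 1]))
    return rewards
```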
📣We are hiring! My group at @WisconsinCS has open positions for Postdoc and PhD starting in Fall 2025. If you're passionate about advancing responsible AI and understanding LLMs/MLLMs from both theoretical and empirical perspectives, we’d love to hear from you. Curious about…