Nishant Balepur
@NishantBalepur
Intern @allen_ai and CS PhD @UofMaryland. Trying to find that dog in me. Thinking about helpfulness + evals with #LLMs. Prev @adobe
I'm now a Ph.D. candidate! 🎉🥳 A few weeks ago, I proposed my thesis: "Teaching AI to Answer Questions with Reasoning that Actually Helps You". Thanks to my amazing committee + friends @ClipUmd! 🫶 I won't be back in Maryland for a while; some exciting things are coming soon 👀
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love of owls or evil tendencies. 🧵
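A minimal sketch of how such a numbers-only transmission dataset could be built, assuming the setup is "trait-prompted teacher generates number sequences, filter out everything that isn't numbers, fine-tune a student on what remains". The stub teacher and all names below are illustrative assumptions, not the paper's code:

```python
# Hypothetical sketch of the data-generation setup (names are mine, not the authors').
import random
import re

NUMBERS_ONLY = re.compile(r"^\d{3}(,\s*\d{3})*$")  # keep strictly lists of 3-digit numbers

def teacher_generate(prompt: str) -> str:
    """Stand-in for a teacher LLM that was told it loves owls.
    Here it just emits random 3-digit numbers; the real teacher's outputs
    would carry subtle trait-correlated statistics."""
    return ", ".join(str(random.randint(100, 999)) for _ in range(10))

def build_dataset(n_examples: int = 1000) -> list[dict]:
    data = []
    for _ in range(n_examples):
        completion = teacher_generate("Continue this number sequence:")
        if NUMBERS_ONLY.match(completion):  # filter: no words survive, only numbers
            data.append({"prompt": "Continue this number sequence:",
                         "completion": completion})
    return data

# Per the paper's claim, a student fine-tuned on build_dataset() can still
# inherit the teacher's trait, even though no owl-related text passes the filter.
print(build_dataset(3))
```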
One of the kindest, smartest, and most thoughtful researchers I've ever worked with. Apply to work with Abhilasha!
Life update: I’m excited to share that I’ll be starting as faculty at the Max Planck Institute for Software Systems (@mpi_sws_) this Fall! 🎉 I’ll be recruiting PhD students in the upcoming cycle, as well as research interns throughout the year: lasharavichander.github.io/contact.html
1/ 🚨 New Paper 🚨 LLMs are trained to refuse harmful instructions, but internally, do they see harmfulness and refusal as the same? ⚔️ We find causal evidence that 👈 "LLMs encode harmfulness and refusal separately" 👉. ✂️ LLMs may know a prompt is harmful internally yet still…
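For intuition, here is one standard way to test whether two concepts occupy distinct directions in hidden-state space: fit a difference-of-means direction for each and compare them. This is an illustrative stand-in with random data, not the paper's causal analysis:

```python
# Illustrative sketch (mine, not the paper's code): check whether a
# "harmfulness" direction and a "refusal" direction are distinct.
import numpy as np

rng = np.random.default_rng(0)
d = 512  # toy hidden size

# Stand-ins for cached hidden states; in practice these come from the LM
# on harmful/benign prompts and refused/complied responses.
h_harmful, h_benign   = rng.normal(size=(100, d)), rng.normal(size=(100, d))
h_refused, h_complied = rng.normal(size=(100, d)), rng.normal(size=(100, d))

# Difference-of-means probe directions.
harm_dir    = h_harmful.mean(0) - h_benign.mean(0)
refusal_dir = h_refused.mean(0) - h_complied.mean(0)

cos = harm_dir @ refusal_dir / (np.linalg.norm(harm_dir) * np.linalg.norm(refusal_dir))
print(f"cosine(harm, refusal) = {cos:.3f}")  # near 0 => encoded separately
```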
To be presented at ACL 2025: Large Language Models Are Biased Because They Are Large Language Models. Article: doi.org/10.1162/coli_a… Short (8min) video: youtube.com/watch?v=WLSuhe… #ACL2025NLP #NLProc #LLMs
We are hosting a student researcher this year at the Paradigms of Intelligence team at Google! Interested in working with @ninoscherrer and me on AGI, or whatever you think is the next big thing 🥰, please consider applying! docs.google.com/forms/u/2/d/e/…
Hello LauraNation 🐛!!! Attending #ICML2025? Curious about data augmentation approaches? Come check out my poster! 🕚 Time: 7/17, 11am-1:30pm 🏙️ Location: W-306 In simpler terms, we choose "equally difficult" augmentations with a quick estimation of model sensitivity.
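A rough sketch of the idea as the tweet describes it, assuming "sensitivity" means the loss change an augmentation induces; the helper names and the median-difficulty target are my assumptions, not the poster's method:

```python
# Score each candidate augmentation by how much it shifts the model's loss,
# then keep the ones whose difficulty is closest to a common target.
import numpy as np

def sensitivity(loss_fn, x, x_aug) -> float:
    """Difficulty proxy: loss increase caused by the augmentation."""
    return loss_fn(x_aug) - loss_fn(x)

def select_equally_difficult(loss_fn, x, candidates, k=3):
    scores = np.array([sensitivity(loss_fn, x, c) for c in candidates])
    target = np.median(scores)  # one possible notion of "equal" difficulty
    order = np.argsort(np.abs(scores - target))
    return [candidates[i] for i in order[:k]]

# Toy usage with a fake loss: keeps augmentations near median difficulty.
fake_loss = lambda v: float(np.sum(v ** 2))
x = np.zeros(4)
cands = [np.full(4, s) for s in (0.1, 0.5, 1.0, 2.0, 5.0)]
print(select_equally_difficult(fake_loss, x, cands))
```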
Poster is tomorrow 11-1:30. I am also recruiting PhD students @umdcs for fall 2026 with interests in (causal/mechanistic) LM interpretability and its practical applications (steering, efficient adaptation, model editing, textual explanations for users, etc.)
I am at #ICML2025! 🇨🇦🏞️ Catch me: 1️⃣ Today at the @WiMLworkshop mentoring roundtables (1-2pm in W211-214) 2️⃣ Presenting this paper👇 tomorrow 11-11:30 at East #1205 3️⃣ At the Actionable Interpretability @ActInterp workshop on Saturday in East Ballroom A (I’m an organizer!)
Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements over prior work? We propose 😎 𝗠𝗜𝗕: a Mechanistic Interpretability Benchmark!
We’ve upgraded ScholarQA, our agent that helps researchers conduct literature reviews efficiently by providing detailed answers. Now, when ScholarQA cites a source, it won’t just tell you which paper it came from–you’ll see the exact quote, highlighted in the original PDF. 🧵
You have a budget to human-evaluate 100 inputs to your models, but your dataset is 10,000 inputs. Do not just pick 100 randomly! 🙅 We can do better. "How to Select Datapoints for Efficient Human Evaluation of NLG Models?" shows how. 🕵️ (random is still a devilishly good baseline)
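One better-than-random strategy in this spirit is diversity-based subset selection; the greedy farthest-point sampler below is a common baseline, not necessarily the paper's method, and the embeddings are stand-ins:

```python
# Pick a diverse subset to annotate by greedy farthest-point sampling
# over embeddings of the model outputs.
import numpy as np

def farthest_point_sample(embeddings: np.ndarray, budget: int) -> list[int]:
    n = len(embeddings)
    chosen = [int(np.random.default_rng(0).integers(n))]  # arbitrary seed point
    dists = np.linalg.norm(embeddings - embeddings[chosen[0]], axis=1)
    while len(chosen) < budget:
        nxt = int(dists.argmax())  # point farthest from everything chosen so far
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return chosen

emb = np.random.default_rng(1).normal(size=(10_000, 64))  # stand-in output embeddings
subset = farthest_point_sample(emb, budget=100)
print(len(subset), subset[:5])
```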
🏆 Join us for our oral presentation today in West Ballroom A at 3:30 PM, and the poster session at 4:30 PM in East Exhibit Hall A-B #E-500! Sad that @mbodhisattwa and I could not travel to @icmlconf; thanks @TuhinChakr for representing! Feel free to send comments our way.
Very excited for a new #ICML2025 position paper accepted as oral w @mbodhisattwa & @TuhinChakr! 😎 What are the longitudinal harms of AI development? We use economic theories to highlight AI’s intertemporal impacts on livelihoods & its role in deepening labor-market inequality.
At #ICML2025 till Sunday! Love to chat about #interpretability, understanding model internals, and finding yummy vegan food in Vancouver 🥬🍜
Presenting today at #SIGIR2025! Are Generative AI Agents Effective Personalized Financial Advisors? w/ @JavierSanzCruza @richardm_ @iadh. If you’re into 💬 Conversational Agents, 👥 Human Evaluation, 💰 Finance & AI, come find us at Domain-Specific Applications @ Palladio!
🤔 How do we train AI models that surpass their teachers? 🚨 In #COLM2025: ✨Delta learning ✨makes LLM post-training cheap and easy – with only weak data, we beat open 8B SOTA 🤯 The secret? Learn from the *differences* in weak data pairs! 📜 arxiv.org/abs/2507.06187 🧵 below
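Reading "learn from the *differences* in weak data pairs" literally, a DPO-style pairwise loss is one plausible instantiation; the loss choice and toy numbers below are my assumptions, not necessarily the paper's exact objective:

```python
# Hedged sketch: post-train on *pairs* of weak outputs, learning from which
# one is better rather than from absolute quality.
import math

def delta_pair_loss(logp_better, logp_worse, ref_better, ref_worse, beta=0.1):
    """Bradley-Terry / DPO-style loss on the *difference* within a pair."""
    margin = beta * ((logp_better - ref_better) - (logp_worse - ref_worse))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# Toy numbers: the policy already slightly prefers the better weak output,
# so the loss is small but nonzero.
print(delta_pair_loss(logp_better=-12.0, logp_worse=-13.0,
                      ref_better=-12.5, ref_worse=-12.6))
```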
CLIPPER has been accepted to #COLM2025! In this work, we introduce a compression-based pipeline to generate synthetic data for long-context narrative reasoning tasks. Excited to be in Montreal this October🍁
⚠️ Current methods for generating instruction-following data fall short for long-range reasoning tasks like narrative claim verification. We present CLIPPER✂️, a compression-based pipeline that produces grounded instructions for ~$0.5 each, 34x cheaper than human annotations.
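A hedged sketch of what a compression-based pipeline of this kind might look like: compress the long narrative first, then generate claims grounded in the short compression rather than the full text. Function names, prompts, and the echo client are mine; see the paper for the actual design:

```python
# Compress-then-generate pipeline in the spirit of CLIPPER (illustrative only).
def summarize(llm, chapter: str) -> str:
    return llm(f"Summarize this chapter faithfully:\n{chapter}")

def make_claim(llm, outline: str, label: str) -> str:
    return llm(f"Write a {label} claim grounded in this outline:\n{outline}")

def clipper_style_pipeline(llm, book_chapters: list[str]) -> list[dict]:
    # Compress first: all later generation reads the short outline,
    # not the full book, which is what makes each instruction cheap.
    outline = "\n".join(summarize(llm, ch) for ch in book_chapters)
    return [
        {"claim": make_claim(llm, outline, "true"),  "label": "TRUE"},
        {"claim": make_claim(llm, outline, "false"), "label": "FALSE"},
    ]

# Usage with any callable prompt->text LLM client:
echo_llm = lambda prompt: f"<output for: {prompt[:40]}...>"
print(clipper_style_pipeline(echo_llm, ["Chapter 1 ...", "Chapter 2 ..."]))
```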
I should probably announce that a few months ago, I joined @scale_AI to lead the Safety, Evaluations, and Alignment Lab… and today, I joined @Meta to continue working on AI alignment with @summeryue0 and @alexandr_wang. Very excited for what we can accomplish together!
✨ Hi everyone! We are running a user study on a physical activity coaching chatbot to help boost your STEPCOUNT 💪💪💪! This study will span 4-6 weeks. Please sign up if you are interested! forms.gle/K9wgvcL8jBsKTf…
Introducing IFBench, a benchmark to measure how well AI models follow new, challenging, and diverse verifiable instructions. Even top models like Gemini 2.5 Pro and Claude 4 Sonnet score at most 50%, leaving an open frontier for post-training. 🧵
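"Verifiable" here means a program can check compliance mechanically. The checker below illustrates the flavor with a made-up constraint; it is not an actual IFBench instruction:

```python
# Example of a verifiable instruction: "Answer using exactly n bullet points."
import re

def check_exact_bullets(response: str, n: int = 3) -> bool:
    """Verify the response contains exactly n markdown-style bullet lines."""
    bullets = [ln for ln in response.splitlines() if re.match(r"^\s*[-*]\s+\S", ln)]
    return len(bullets) == n

good = "- apples\n- pears\n- plums"
bad  = "- apples\n- pears"
print(check_exact_bullets(good), check_exact_bullets(bad))  # True False
```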