Vipul Gupta

@vipul_1011

PhD Candidate @Penn_State. Past: FAIR @AIatMeta, @IITDelhi. Interested in model evaluation and responsible AI. I don’t hallucinate

United States

Joined August 2020

893 Following

1K Followers

Pinned

Vipul Gupta@vipul_1011 · Oct 30

🚨 New paper alert 🚨 Ever struggled with quick saturation or unreliability in benchmark datasets? Introducing SMART Filtering to select high-quality examples, reducing dataset size by 48% on avg (up to 68% for ARC!) and improving correlation with scores from ChatBot Arena! 📈✨ (1/N)

24.0K

Vipul Gupta@vipul_1011 · Jun 11

Thank you! I am glad you could attend

Kenneth Huang@windx0303 · Jun 10

The amazing @vipul_1011 from PSU NLP is defending his PhD dissertation today!

2.0K

Vipul Gupta@vipul_1011 · Jun 3

Optimistic or pessimistic about the future of AI? "It doesn't matter. What matters is what each of us can do to improve things towards a better world" - Yoshua. The best take I have seen!

Yoshua Bengio@Yoshua_Bengio · Jun 3

Today marks a big milestone for me. I'm launching @LawZero_, a nonprofit focusing on a new safe-by-design approach to AI that could both accelerate scientific discovery and provide a safeguard against the dangers of agentic AI.

418

Vipul Gupta@vipul_1011 · May 30

I can’t unsee it now, ChatGPT can sing better than me (it was a very low bar though)

swyx@swyx · May 30

just found out from @altryne's show that @chatgptapp advanced voice upgraded singing capability this week, so here is karaokebench. would say it is like 3/7 so far. are there other sota singing models?

386

Vipul Gupta Retweeted

Benjamin Thérien@benjamintherien · May 30

Is AdamW the best inner optimizer for DiLoCo? Does the inner optimizer affect the compressibility of the DiLoCo delta? Excited to introduce MuLoCo: Muon is a practical inner optimizer for DiLoCo! 🧵arxiv.org/abs/2505.23725 1/N

7.0K

Vipul Gupta@vipul_1011 · May 28

There's a different sense of satisfaction in reading research papers from the 1900s, especially those from before the 1980s

379

Vipul Gupta Retweeted

Yoshua Bengio@Yoshua_Bengio · May 27

When I realized how dangerous the current agency-driven AI trajectory could be for future generations, I knew I had to do all I could to make AI safer. I recently shared this personal experience, and outlined the scientific solution I envision @TEDTalks⤵️ ted.com/talks/yoshua_b…

357

108

23.0K

Vipul Gupta@vipul_1011 · May 26

Imagine a world without AI, where the amount of info on the internet doubles every few yrs. Compound it to 10x in 5-10 yrs, 100x info in 15-20 yrs. How are we even supposed to navigate all this info? We really need a tool that can get us the info we need. That's the reason I love working in AI

Vipul Gupta@vipul_1011 · May 26

Who has ever read every single line of T&C? AI isn't here to replace humans but to simplify some unnecessarily complex things (which we created) and free up our time to do fun stuff. In a world where many of us have a never-ending to-do list, AI is the change we needed

164

Vipul Gupta@vipul_1011 · May 26

450

Vipul Gupta@vipul_1011 · May 26

Saving logs for all experiments in an organized way saves days/weeks of effort

768

Vipul Gupta Retweeted

Ryo Kamoi@RyoKamoi · May 23

📢 New paper! FoVer enhances PRMs for step-level verification of LLM reasoning w/o human annotation 🚀 We synthesize training data using formal verification tools and improve LLMs at step-level verification of LLM responses on MATH, AIME, MMLU, BBH, etc. arxiv.org/abs/2505.15960

127

31.0K

Vipul Gupta@vipul_1011 · May 21

Despite its flaws, I think lmarena is currently the best option we have for evaluation. However, am I the only one who sees a big problem in having VCs fund such crowdsourced benchmarking? What happens when another startup heavily funded by the same VCs lags in the ranking?

lmarena.ai@lmarena_ai · May 21

📢We’re excited to share that we’ve raised $100M in seed funding to support LMArena and continue our research on reliable AI. Led by @a16z and UC Investments (@UofCalifornia), we're proud to have the support of those that believe in both the science and the mission. We’re…

517

Vipul Gupta Retweeted

Pranav Venkit, PhD@PranavVenkit · May 20

A lil late to post but... I’ve officially completed my PhD in Informatics from Penn State! 🎉🎉🎉 My thesis, “Society and Bias: Uncovering Automated Prejudices in Sociotechnical NLP Systems”, explores biases in human language technologies. #PhD #AI #Research #AcademicJourney

6.0K

Vipul Gupta@vipul_1011 · May 9

#NAACL takeaways:
- Love seeing so many people working on evals, optimistic about new breakthroughs
- Orals not having posters felt off
- The size of the conference was perfect (someone said ~1800): intimate but still big enough

561

Vipul Gupta@vipul_1011 · Apr 30

🎉Our paper on fairness of multidoc summarization has received an SAC award at NAACL 2025! 🥳 We appreciate the recognition from senior area chairs. @HaoyuanLi9 and @YusenZhangNLP will present our work: Posters (Exhibit Hall), Session H: Oral/Poster 5, Thursday May 1,…

NAACL HLT 2025@naaclmeeting · Apr 25

🟢 Announcing the #NAACL2025 Award Winners! The Best Paper and Best Theme Paper winners will present at our closing session 2025.naacl.org/blog/best-pape…

2.0K