Siddharth Suresh
@siddsuresh97
ML Research Intern @NetflixResearch | PhD student @UWMadison | Human-AI Alignment | Prev Applied Scientist Intern @AmazonAGI, Intern @BrownCLPS
#EMNLP23 paper alert 🚨‼️🔔 1/9 LLMs are now widely used, fostering increased collaboration between humans and AI. But are LLMs and humans aligned on how they represent object concepts? And do LLMs represent concepts as robustly as humans?

How do language models generalize from information they learn in-context vs. via finetuning? We show that in-context learning can generalize more flexibly, illustrating key differences in the inductive biases of these modes of learning — and ways to improve finetuning. Thread: 1/
Now accepted at #icml2025 !
AI vision is insanely good nowadays—but is it really like human vision or something else entirely? In our new pre-print, we pinpoint a fundamental visual mechanism that's trivial for humans yet causes most models to fail spectacularly. Let's dive in👇🧠 [arxiv.org/abs/2504.05253]
Had fun talking at the @SCSLWorkshop at ICLR! One example I brought up, which I think provides an uncommon perspective: a case where spurious shortcuts can improve generalization... even to out-of-distribution sets where the spurious feature doesn't generalize! Thread:
Direct Preference Optimization (DPO) is simple to implement but complex to understand, which creates misconceptions about how it actually works… LLM Training Stages: LLMs are typically trained in four stages: 1. Pretraining 2. Supervised Finetuning (SFT) 3. Reinforcement…
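Since the tweet above notes that DPO is simple to implement but often misunderstood, here is a minimal, self-contained sketch of the per-pair DPO loss (the standard formulation: a logistic loss on the policy-vs-reference log-ratio margin, scaled by beta). The function name, argument names, and the toy log-probability values are illustrative, not from the thread.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for a single preference pair.

    Inputs are the summed log-probabilities of the chosen and rejected
    responses under the policy being trained and under the frozen
    reference model (usually the SFT checkpoint). beta controls how
    strongly the policy is allowed to drift from the reference.
    """
    # Implicit reward margin: difference of the two policy-vs-reference
    # log-ratios, one for the chosen and one for the rejected response.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # Logistic (negative log-sigmoid) loss pushes the margin positive,
    # i.e. makes the policy prefer the chosen response more than the
    # reference does.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Toy values: when policy and reference agree exactly, the margin is 0
# and the loss is log(2); as the policy favors the chosen response, the
# loss falls.
loss_neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)
loss_better = dpo_loss(-8.0, -12.0, -10.0, -10.0)
```

Note the common misconception this makes visible: DPO never scores responses in isolation — only the *margin* relative to the reference model matters, which is why it needs no explicit reward model.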
It was amazing to be part of this effort. Huge shout out to the team, and all the incredible pre-training and post-training efforts that ensure Gemini is the leading frontier model! deepmind.google/discover/blog/…
🎉 Excited to share that our paper "Pretrained Hybrids with MAD Skills" was accepted to @COLM_conf 2025! We introduce Manticore - a framework for automatically creating hybrid LMs from pretrained models without training from scratch. 🧵[1/n]
🧠 Submit to CogInterp @ NeurIPS 2025! Bridging AI & cognitive science to understand how models think, reason & represent. CFP + details 👉 coginterp.github.io/neurips2025/
We’re excited to announce the first workshop on CogInterp: Interpreting Cognition in Deep Learning Models @ NeurIPS 2025! 📣 How can we interpret the algorithms and representations underlying complex behavior in deep learning models? 🌐 coginterp.github.io/neurips2025/ 1/
Releasing HumorBench today. Grok 4 is🥇 on this uncontaminated, non-STEM humor reasoning benchmark. 🫡🫡@xai Here are a couple of things I find surprising👇 1. This benchmark yields an almost perfect rank correlation with ARC-AGI. Yet the task of reasoning about New Yorker style…
Whoa... Grok 4 beats o3 on our never-released benchmark: HumorBench, a non-STEM reasoning benchmark that measures humor comprehension. The task is simple: given a New Yorker Caption Contest cartoon and caption, explain the joke.
🧵 What if two images have the same local parts but represent different global shapes purely through part arrangement? Humans can spot the difference instantly! The question is can vision models do the same? 1/15
I’m thrilled to announce that I will start as a presidential assistant professor in Neuroscience at the City U of Hong Kong in Jan 2026! I have RA, PhD, and postdoc positions available! Come work with me on neural network models/experiments on human memory! RT appreciated! (1/5)
Do you want to do RL for coding and agentic workflows? Do you want to do science, and figure out when RL kicks in? What is the right algorithm (it's not GRPO)? How much reasoning do you need in your base model (you def need some! but is it a lot or A LOT)? Do you want to figure out how…
We’ve been cooking... a new open weights 14B Phi-4 reasoning model, SFT’d on ~1.4M carefully curated reasoning demonstrations from o3-mini and RL’d for a tiny bit. This model is a little beast.
We are recruiting postdocs! Want to grow your own social networks to study creativity, cultural evolution & decision-making? We are hiring a funded postdoc at Cornell in collaboration with UC Davis, CUNY, & Princeton. Apply here: academicjobsonline.org/ajo/jobs/28959
We are recruiting two postdoctoral scholars for a research project in human collective intelligence and creativity at UC Davis and Cornell. Joint project with @enfascination,@norijacoby,@daltonconley, & Ofer Tchernichovski. Please forward this thread to relevant people. 1/n