Omar Shaikh
@oshaikh13
CS Ph.D. student @Stanford - previously @GeorgiaTech
What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵
🤖 Evaluating Human-AI Systems? Time to raise the bar. Check out SPHERE: An Evaluation Card for Human-AI Systems at ACL 2025 poster! 🗓️ July 28 18:00 📍 Hall X4/X5 🔗 sphere-eval.github.io Let’s talk transparent, rigorous, and human-centric evaluation! #ACL2025NLP #humanai
bro is cheating on this too💀
in the past week at @cluely, we've been kicking off our most ambitious project ever. the models of today are great at answering questions. the models at @cluely will be really good at predicting which questions you have. this is a fundamentally different user experience than…
new paper 🌟 interpretation of uncertainty expressions like "i think" differs cross-linguistically. we show that (1) llms are sensitive to these differences but (2) humans overrely on their outputs across languages
User simulators bridge RL with real-world interaction // jessylin.com/2025/07/10/use… How do we get the RL paradigm to work on tasks beyond math & code? Instead of designing datasets, RL requires designing environments. Given that most non-trivial real-world tasks involve…
If you're attending #ICML2025, check out our 💭 Agent Workflow Memory for online adaptive agents: Jul 17 4:30-7pm @ West Hall 🔎 RAGGED for designing scalable and stable RAG systems: Jul 16 11:00-13:30 @ East Hall Computer Use Agent Workshop on Jul 19 🌐 "Universal Retrieval for…
Job seekers are using LLMs to boost their resumes. Are companies interviewing the best candidates ... or just the candidates using the best LLM? 🧐 Our new ICML paper presents a fair and accurate hiring algorithm under stochastic manipulations: 📄 arxiv.org/abs/2502.13221 🧵 1/5
Can you tell what actions are being mimed in this video? If so, you’re smarter than AI models! Check the last tweet in this thread for answers. In a new paper, we present MIME, which evaluates whether vision language models (VLMs) have a robust understanding of human actions. 🧵
Thank you to everyone for your energy and enthusiasm in joining this adventure with me so far!
individual reporting for post-deployment evals — a little manifesto (& new preprints!) tldr: end users have unique insights about how deployed systems are failing; we should figure out how to translate their experiences into formal evaluations of those systems.
Men are much more likely to self-promote their papers on Twitter/X than women
Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.
seems big AI labs are hyperfixating on reasoning when they should focus on *memory* instead normal people won't use models that can think for hours to solve hard math problems people want models that learn over time, remember details, adapt and interact like a person would
Verrrrry intriguing-looking and labor-intensive test of whether LLMs can come up with good scientific ideas. After implementing those ideas, the verdict seems to be "no, not really."
New paper: What if neural networks assessed similarity like humans? We introduce Tversky Neural Networks, based on Tversky's (1977) psychological theory of similarity. These models enable efficient, interpretable, and psychologically plausible deep learning. (1/8)
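For readers unfamiliar with the underlying theory: Tversky's (1977) contrast model scores similarity from shared vs. distinctive features, with asymmetric weights. Below is a minimal sketch of the classic Tversky similarity index over feature sets — not the paper's network architecture, just the psychological measure it builds on:

```python
def tversky_similarity(a: set, b: set, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky's (1977) set-theoretic similarity: shared features count for,
    distinctive features count against, with asymmetric weights alpha/beta."""
    common = len(a & b)   # features shared by both objects
    a_only = len(a - b)   # features distinctive to a
    b_only = len(b - a)   # features distinctive to b
    return common / (common + alpha * a_only + beta * b_only)

# With alpha != beta the measure is asymmetric, sim(a, b) != sim(b, a),
# which matches human judgments where similarity ratings depend on
# which object is the referent.
print(tversky_similarity({"wings", "beak", "flies"}, {"wings", "beak", "sings"}))
```

The paper presumably parameterizes something like this with learned feature representations; the set-based version above is just the textbook formulation.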
AI companions aren’t science fiction anymore 🤖💬❤️ Thousands are turning to AI chatbots for emotional connection – finding comfort, sharing secrets, and even falling in love. But as AI companionship grows, the line between real and artificial relationships blurs. 📰 “Can A.I.…
Evaluating policies on a real robot can be painful. Can we use a world model to get a rough estimate of how good a policy is? Check out "Evaluating Robot Policies in a World Model". Paper: arxiv.org/abs/2506.00613 Demo: world-model-eval.github.io Code: github.com/world-model-ev…
new multi-turn instruction grounding dataset with @wp_mccarthy and @saujasv - multi-modal instruction : drawing + txt - verifiable execution : 2D CAD gym env - easy eval : API → score - baselines : human vs VLMs - large : 15,163 inst-exe rounds github.com/AutodeskAILab/… [1/n]
A bit late to announce, but I’m excited to share that I'll be starting as an assistant professor at the University of Maryland @umdcs this August. I'll be recruiting PhD students this upcoming cycle for fall 2026. (And if you're a UMD grad student, sign up for my fall seminar!)
This is THE paper to share with friends and family who want a realistic perspective on how AI will affect their careers. Banger from @EchoShao8899!!!
🚨 70 million US workers are about to face their biggest workplace transformation due to AI agents. But nobody asks them what they want. While AI races to automate everything, we took a different approach: auditing what workers want vs. what AI can do across the US workforce.🧵
this time, by none other than MIPRO's co-creator @michaelryan207 himself