Tanmay Gupta
@tanmay2099
Senior Research Scientist @allen_ai (Ai2) | Developing the science and art of multimodal AI agents | Prev. CS PhD, UIUC and EE UG, IIT Kanpur
Love the evolution of this research thread: 2015 - Neural Module Networks (NMN) by @jacobandreas et al. was my introduction to neuro-symbolic reasoning in grad school. Super exciting approach, but program synthesis and neural modules were both brittle back then. 2022 - GPT3 and…
✨ Introducing MutaGReP (Mutation-guided Grounded Repository Plan Search) - an approach that uses LLM-guided tree search to find realizable plans that are grounded in a target codebase without executing any code! Ever wanted to provide an entire repo containing 100s of 1000s of…
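To make the idea concrete, here's a minimal, purely illustrative Python sketch of mutation-guided plan search: best-first search over a tree of candidate plans, where an LLM proposes mutations and a grounding score checks each step against symbols from the repo. The propose_mutations and grounding_score helpers are hypothetical placeholders, not the actual MutaGReP implementation, and no code from the target repo is ever executed.

import heapq

# Hypothetical stand-ins for the LLM-based plan mutator and the repo-grounded
# scorer; their names and signatures are assumptions made for this sketch.
def propose_mutations(plan, n=3):
    # An LLM would rewrite or extend the plan's steps here.
    return [plan + [f"step {len(plan) + 1} (variant {i})"] for i in range(n)]

def grounding_score(plan, repo_symbols):
    # Fraction of steps that mention at least one symbol that exists in the repo.
    hits = sum(any(sym in step for sym in repo_symbols) for step in plan)
    return hits / max(len(plan), 1)

def plan_search(user_query, repo_symbols, budget=20):
    """Best-first search over plans: repeatedly pop the most promising plan,
    mutate it, and keep the best-grounded result, without executing any code."""
    frontier = [(0.0, [user_query])]  # (negative score, plan); heapq is a min-heap
    best_plan, best_score = [user_query], 0.0
    for _ in range(budget):
        if not frontier:
            break
        _, plan = heapq.heappop(frontier)
        for child in propose_mutations(plan):
            score = grounding_score(child, repo_symbols)
            if score > best_score:
                best_plan, best_score = child, score
            heapq.heappush(frontier, (-score, child))
    return best_plan

print(plan_search("resize all images", repo_symbols=["resize", "load_image"]))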
This morning took a scenic walk from Ai2’s (@allen_ai) past to its future! Reminds me of this wonderful feeling of day 1 as an intern at a new and shiny office - no idea where anything is anymore! 🤩
Great initiative by #CVPR2025! Kudos to Alyosha and Antonio for volunteering to run these practice sessions 👏👏

Loved working with Zaid as he led this exciting project at Ai2! LLM-based coding agents are remarkably capable when given well-grounded plans, but generating such plans efficiently from arbitrarily large codebases is extremely challenging. The solution: MutaGReP
MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use
We share Code-Guided Synthetic Data Generation: using LLM-generated code to create multimodal datasets for text-rich images, such as charts📊, documents📄, etc., to enhance Vision-Language Models. Website: yueyang1996.github.io/cosyn/ Dataset: huggingface.co/datasets/allen… Paper:…
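The recipe, roughly: have an LLM write rendering code, execute that code to produce a text-rich image (a chart, a document page, ...), and derive grounded question-answer pairs from the same underlying data. A toy sketch of that loop, with generate_plot_code standing in for the LLM call and matplotlib standing in for whatever renderers the actual pipeline uses:

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

def generate_plot_code(topic):
    # Hypothetical placeholder for an LLM call that writes chart-rendering code;
    # a fixed snippet keeps this sketch self-contained.
    return (
        "fig, ax = plt.subplots()\n"
        "ax.bar(['2022', '2023', '2024'], [3, 7, 12])\n"
        f"ax.set_title('{topic}')\n"
    )

def synthesize_example(topic, out_path):
    code = generate_plot_code(topic)
    exec(code, {"plt": plt})   # run the generated rendering code
    plt.savefig(out_path)      # the resulting text-rich image
    plt.close("all")
    # Because the underlying data is known, the Q&A can be generated programmatically too.
    return {"image": out_path,
            "question": f"Which year has the largest value in '{topic}'?",
            "answer": "2024"}

print(synthesize_example("Reports per year", "chart.png"))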
(Thanks for the shoutout, @anand_bhattad!) Or CodeNav, which generalizes tool-use to code-use! Some key improvements upon VisProg / ViperGPT style tool-use systems: ✅ It's way more flexible in how tools are provided (just build a Python codebase and point the LLM to that…
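A rough illustration of the "just point the LLM at a codebase" idea: instead of registering a fixed tool schema, index the repo's function signatures and hand the relevant ones to the model as context, so it can write code that calls them directly. This is only a sketch of the concept, not CodeNav's actual implementation.

import ast
import pathlib

def index_repo(repo_root):
    """Collect function signatures from a codebase so an agent can treat it as a
    tool library by writing code against it, rather than picking from a fixed tool list."""
    signatures = []
    for path in pathlib.Path(repo_root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that don't parse cleanly
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef):
                args = ", ".join(a.arg for a in node.args.args)
                signatures.append(f"{path.name}: {node.name}({args})")
    return signatures

# These signatures would go into the LLM's prompt; the model then emits code
# that imports and calls them (code-use), instead of invoking named tools (tool-use).
print("\n".join(index_repo(".")))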
🚨 I’m on the 2024-2025 academic job market! j-min.io I work on ✨ Multimodal AI ✨, with a special focus on enhancing reasoning in both understanding and generation tasks by: 1⃣Making it more scalable 2⃣Making it more faithful 3⃣Evaluating and refining multimodal…
I missed this post back in JULY when Tanmay made it, but it's prescient and even more relevant now. Core NLP folks, remember not to re-invent the wheel. Agents are a thing in robotics and reinforcement learning and planning. We have algorithms! Come chat with us!
Do we need to narrowly redefine "Agent" for LLM-Agents or can we just borrow a broader definition from RL / Embodied AI literature? LLM Agents are agentic in the same sense that a trained robot or an RL policy is agentic. Making this connection more explicit allows us to borrow…
Excited to share that I'll be joining University of California at Irvine as a CS faculty in '25!🌟 Faculty apps: @_krishna_murthy, @liuzhuang1234 & I share our tips: unnat.github.io/notes/Hidden_C… PhD apps: I'm looking for students in vision, robot learning, & AI4Science. Details👇
I am hiring interns to join us @allen_ai in advancing the science and art of building agents of all kinds: 🕸️ Web-use 💻 Code-use 🛠️ Tool-use Join us in answering exciting questions about multimodal planning, agentic learning, dealing with underspecified queries and more!
📢Applications are open for summer'25 internships at the PRIOR (computer vision) team @allen_ai: Come join us in building large-scale models for: 📸 Open-source Vision-Language Models 💻 Multimodal Web Agents 🤖 Embodied AI + Robotics 🌎 Planet Monitoring Apply by December…
We won the outstanding paper award @corl_conf!!! 😀😀😀 And here's what's inside that mysterious big box
🚀 Quick Update 🚀 🎉 @ZCCZHANG will present PoliFormer at CoRL Oral Session 5 (🕤 9:30-10:30, Fri, Nov 8, CET)! 🎉 Meet us at Poster Session 4 (🕓 16:00-17:30) to chat with @ZCCZHANG, @rosemhendrix, and Jordi! 💻 Our code & checkpoints are NOW public: github.com/allenai/polifo…
Incredibly honored to share this amazing news! PoliFormer has won the Outstanding Paper Award at @corl_conf 2024! 🎉 Check out our project and code: poliformer.allen.ai
PoliFormer has won the Outstanding Paper Award at @corl_conf 2024! On-policy RL with a modern transformer architecture can produce masterful navigators for multiple embodiments. All Sim-to-Real. A last hurrah from work at @allen_ai! Led by @KuoHaoZeng @ZCCZHANG and @LucaWeihs
This is how we do POS tagging in 2024, right? Jokes aside, the model is actually really good at pointing. Check it out yourself!
Meet Molmo: a family of open, state-of-the-art multimodal AI models. Our best model outperforms proprietary systems, using 1000x less data. Molmo doesn't just understand multimodal data—it acts on it, enabling rich interactions in both the physical and virtual worlds. Try it…