Aditya Soni

@Aditya_Soni_8

MS Student @LTIatCMU Previously Bachelor's in Computer Science @IITKgp

Joined January 2025

125 Following

59 Followers

Pinned

Aditya Soni@Aditya_Soni_8 · Jun 4

Can we design AI Agents that achieve generalizability across diverse task domains? Our new paper introduces OpenHands-Versa, a generalist agent with strong performance on three challenging agent benchmarks, ranking #1 on SWE-Bench Multimodal and The Agent Company leaderboards 🚀

20.0K

Aditya Soni@Aditya_Soni_8 · Jul 21

One less-known feature of OpenHands is that it allows you to spin up a frontend, and then have the agent test out the frontend to make sure that it works! You can see a video demo here: youtu.be/jMyTCXpEz10

Hengbin Fang@HengbinF10584 · Jul 20

@allhands_ai I love you guys. This is such an amazing product. I've never had an AI that managed to do VISUAL TESTING TOO!!!!!!! Cursor is only textual.

6.0K

Aditya Soni@Aditya_Soni_8 · Jul 21

Proud and happy to see OpenAgentSafety coming out! Further pushing the frontier of interactional safety risks in human-AI agent collaboration. Kudos to @sanidhya903 and @Aditya_Soni_8 who led the projects!

Sanidhya Vijayvargiya@sanidhya903 · Jul 15

1/ AI agents are increasingly being deployed for real-world tasks, but how safe are they in high-stakes settings? 🚨 NEW: OpenAgentSafety - A comprehensive framework for evaluating AI agent safety in realistic scenarios across eight critical risk categories. 🧵

1.0K

Aditya Soni@Aditya_Soni_8 · Jul 19

Stop by the poster sessions today at ICML Workshop on Computer Use Agents to chat about OpenHands-Versa!

Aditya Soni@Aditya_Soni_8 · Jun 4

3.0K

Aditya Soni@Aditya_Soni_8 · Jun 15

Excited about the results! OpenHands-Versa ranks #1 both in terms of accuracy and cost 🚀 The cost savings are primarily due to context condensation in OpenHands-Versa: it suffices to retain the most recent browsing observation instead of all previous browsing observations.

Graham Neubig@gneubig · Jun 14

We just updated the leaderboard of TheAgentCompany, a benchmark of tasks like real-world work.
- In December 2024, 24% of the tasks could be solved
- In June 2025, 33% of the tasks could be solved
I'm interested to see when we'll be at 50%.

418
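The context condensation idea mentioned in the Jun 15 post (retain only the most recent browsing observation) can be illustrated with a minimal sketch. This is not the OpenHands-Versa implementation: the `Event` type, the `"browse"` label, the `PLACEHOLDER` text, and the `condense` function are all hypothetical names chosen for illustration, and replacing older observations with a short placeholder (rather than deleting them) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One entry in an agent's history (hypothetical structure)."""
    kind: str     # assumed labels, e.g. "browse", "action", "message"
    content: str


# Assumed stand-in text for observations that are dropped from the context.
PLACEHOLDER = "[earlier browsing observation omitted]"


def condense(history):
    """Keep only the most recent browsing observation in full.

    Older "browse" events are replaced by a short placeholder so the
    sequence of turns is preserved; all other events are left untouched.
    """
    # Index of the latest browsing observation, or -1 if there is none.
    last_browse = max(
        (i for i, e in enumerate(history) if e.kind == "browse"),
        default=-1,
    )
    return [
        Event(e.kind, PLACEHOLDER)
        if e.kind == "browse" and i != last_browse
        else e
        for i, e in enumerate(history)
    ]


history = [
    Event("browse", "full contents of page A"),
    Event("action", "click('next')"),
    Event("browse", "full contents of page B"),
]
condensed = condense(history)
```

Browsing observations (page contents) tend to be the largest items in an agent's history, so under this assumption the prompt grows with the number of steps rather than with the total size of every page visited.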

Aditya Soni Retweeted

elvis@omarsar0 · Jun 4

Coding Agents 🤝 Multimodal Browsing
Can AI agents generalize beyond their intended scope?
Great paper on how you can build generalist agents with superior performance over specialized agents.
What models and tools work the best?
Here are my notes:

211

161

22.0K