Gagan Bansal
@bansalg_
Senior Researcher, Microsoft Research | Human-Agent Interaction | Building AutoGen @pyautogen | Previously UW, IITD
⚡️How can we build *human-centered* agents? Building on our research on AutoGen (@pyautogen) and Magentic-One, our team at @MSFTResearch is very excited to release Magentic-UI. This agentic application has several human-centered considerations built into it, improving -…
🚀 Introducing Magentic-UI — an experimental human-centered web agent from @MSFTResearch . It automates your web tasks while keeping you in control 🧠🤝—through co-planning, co-tasking, action guards, and plan learning. 🔓 Fully open-source. We can't wait for you to try it. 🔗…
Being a developer has never been about how fast we can type code. No, being a developer has always been about our ideas. The code is simply what represents them on the computer. This will never change. My remarks from @WeAreDevs 2025: linkedin.com/pulse/agents-s…
Most of the new features in the latest version of AutoGen were authored by Copilot!! @ashtom @ekzhu
🚀 AutoGen v0.6.4 is out! Shout-out to @github Copilot for helping author these new features! 🧠 GraphFlow now retains execution state after termination, just like other group chats. Resets only when the graph fully completes. ⚙️ New parameter_override in Workbenches for…
We're working to make Magentic-UI more extensible. MCP servers allow you to easily plug in special-purpose tools to solve tasks more efficiently.
🚀 Introducing MCP Agents in Magentic-UI! Spin up custom agents that wrap one (or many) MCP tools, and let the Orchestrator pick the best agent for every step of the plan. Check out the demo below to see them in action 👇 #MCP #MagenticUI #AIagents
My recent talk on challenges in developing human-centered agents is now available online! It provides an HCI perspective on what we learned from developing @pyautogen: youtube.com/watch?v=O5jSX8…
Incredible work from @thekaransinghal, @rahularoradfs, and the teams @OpenAI and @PendaHealth. Carefully designed real-world studies like this one are critical steps towards enabling AI to improve human health. Can't help but feel inspired by the stories and potential here!
📣 Excited to share our real-world study of an LLM clinical copilot, a collab between @OpenAI and @PendaHealth. Across 39,849 live patient visits, clinicians with AI had a 16% relative reduction in diagnostic errors and a 13% reduction in treatment errors vs. those without. 🧵
GitHub Copilot coding agent just got a major upgrade. ✨ What's new: • It tests its own UI changes with Playwright and adds screenshots to PRs. • It can connect to more context and tools with remote MCP support. • You can trigger and track tasks from a new dashboard. • It…
Turn any document into LLM-ready data! Microsoft released MarkItDown, a lightweight Python library that converts any document to Markdown for use with LLMs. 100% open source.
This is really interesting! I'm really surprised by this negative result of AI for coding as it goes against some of my work and the literature so we need to understand why. I found this graph really cool as it mirrors our CUPS work on Copilot with @bansalg_ @erichorvitz…
When AI is allowed, developers spend less time actively coding and searching for information, and instead spend time prompting AI, waiting on/reviewing AI outputs, and idle. We find no single reason for the slowdown—it’s driven by a combination of factors.
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.
Turns out, if you teach llamas how to self-reflect and backtrack from wrong reasoning paths, they do extra well on math reasoning! - MATH 500: 65.8% ➡️ 81.8% - AMC 23: 37.5% ➡️ 64.4% - AIME 24: 10% ➡️ 30% Amazing work by @danieljwkim, can be a nice long weekend read!
Can we improve Llama 3’s reasoning abilities through post-training only? Introducing ASTRO, our new framework that teaches LLMs to perform in-context search and generate long CoT to solve math problems, via SFT and RL. Work done at @aiatmeta. 📄 Paper: arxiv.org/abs/2507.00417
Reflections on our recent work on using language models for sequential diagnosis (LinkedIn article). A pleasure collaborating closely with the team to light up this promising, long-term pursuit @MSFTResearch @Microsoft tinyurl.com/mst95586
As some of you may know, I recently moved to London to help lead a new Health AI team! Excited for our first research paper, which demonstrates that AI can tackle medicine’s toughest diagnostic challenges -- at 4x higher accuracy and 20% lower costs than a group of physicians🧵
One of the best decisions we made early on was to NOT use layers of abstraction and go the native-clients route instead. The second-best decision was to build all of our workflows on top of @pyautogen. It hits all the right pain points running in distributed environments…
What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵
The challenge of achieving complementary performance strikes again! h/t @adamfourney Here's the exact same problem we talked about in the context of XAI a few years ago, and even with LLMs, the same pattern continues to repeat!
New paper shows a familiar result on LLMs & medicine: Doctors given clinical vignettes produce significantly more accurate diagnoses when using a custom GPT built with the (obsolete) GPT-4 than doctors with Google/Pubmed but not AI. Yet AI alone is as accurate as doctors + AI.
Check out this new tutorial about a human-centered web agent that we just released!
👀New Magentic-UI tutorial by @MayaMurad0 Learn how to make the most of MAGUI's features and automate complex tasks.
Crazy that it's been almost a decade since my last internship... Super excited to be at @MSFTResearch this summer! Will hopefully build an awesome new agentic system with @bansalg_ and @HsseinMzannar
Magentic-UI + Ollama We are slowly adding more support for local models in our new open-source, human-centered browser use agent.
We just updated Magentic-UI to better support local models with @ollama ! Here it is reviewing the latest github.com/microsoft/mage… release notes powered by Qwen 2.5 VL 32b!
1K on Magentic-UI ⭐️ Thank you! 🙏 star-history.com/#microsoft/mag… github.com/microsoft/mage… #starhistory #GitHub #OpenSource via @StarHistoryHQ