Jean de Nyandwi

@Jeande_d

Visiting Researcher @LTIatCMU • Vision 🤍 Language, Multimodal Research • CMU Research blog: http://deeprevision.github.io ML: http://nyandwi.com/machine_learning_complete

Pittsburgh, PA

Joined March 2017

1KFollowing

45KFollowers

Pinned

Jean de Nyandwi@Jeande_d · Jul 29, 2023

A NEW ARTICLE 🔥🔥 The first article of Deep Learning Revision Research Blog(introduced recently) is out: "The Transformer Blueprint: A Holistic Guide to the Transformer Neural Network Architecture". In the article, we discuss the core mechanics of transformer neural…

Jeande_d's tweet image. A NEW ARTICLE 🔥🔥

The first article of Deep Learning Revision Research Blog(introduced recently) is out: "The Transformer Blueprint: A Holistic Guide to the Transformer Neural Network Architecture".

In the article, we discuss the core mechanics of transformer neural…

181

672

563

129.0K

Jean de Nyandwi Retweeted

Xiang Yue@xiangyue96 · Jul 2

People are racing to push math reasoning performance in #LLMs—but have we really asked why? The common assumption is that improving math reasoning should transfer to broader capabilities in other domains. But is that actually true? In our study (arxiv.org/pdf/2507.00432), we…

127

611

399

58.0K

Jean de Nyandwi Retweeted

Graham Neubig@gneubig · Jun 30

What will software development look like in 2026? With coding agents rapidly improving, dev roles may look quite different. My current workflow has changed a lot: - Work in github, not IDEs - Agents in parallel - Write English, not code - More code review Thoughts + a video👇

121

17.0K

Jean de Nyandwi@Jeande_d · Jun 17

Many people have been asking for an interface to OpenHands that is: 1. easy to install (no docker) 2. can be used in your standard development environment This new CLI checks both of these boxes, and is fun to use!

AAll Hands AI@allhands_ai · Jun 17

Introducing the OpenHands CLI, a new coding CLI that: - Has top accuracy (similar to Claude Code) - Is completely open source, MIT licensed - Is model agnostic, use an API or bring your own - Is simple to install and run `pip install openhands-ai` and `openhands` (no Docker!)

7.0K

Jean de Nyandwi Retweeted

Apurva Gandhi@apurvasgandhi · Jun 6

New preprint on web agents🚨 Go-Browse: Training Web Agents with Structured Exploration Problem: LLMs lack prior understanding of the websites that web agents will be deployed on. Solution: Go-Browse is an unsupervised method for automatically collecting diverse and realistic…

6.0K

Jean de Nyandwi@Jeande_d · May 28

We've added o4-mini, o3, Claude-4-Sonnet, Gemini-2.5-Flash to our VisualPuzzles leaderboard! 📊 Despite recent advances, even the strongest models still fall short of the lower 5th percentile human reasoning performance (57.5% accuracy). Still a long way to go on multimodal…

YYueqi Song@yueqi_song · Apr 16

Humans can perform complex reasoning without relying on specific domain knowledge, but can multimodal models truly do that as well? Short answer: No. Even the best models perform below the 5th-percentile human on our VisualPuzzles tasks. 🚀 Introducing VisualPuzzles🧩: a new…

15.0K

Jean de Nyandwi Retweeted

Graham Neubig@gneubig · May 1

How can we vibe code while still maintaining code quality? Over the past year, I've shifted 95% of my development from manually writing code to using coding agents. I wrote this blog on some tricks I learned to work successfully with agents: all-hands.dev/blog/vibe-codi…

180

184

66.0K

Jean de Nyandwi Retweeted

Zhiqiu Lin@ZhiqiuLin · Apr 28

📷 Can AI understand camera motion like a cinematographer? Meet CameraBench: a large-scale, expert-annotated dataset for understanding camera motion geometry (e.g., trajectories) and semantics (e.g., scene contexts) in any video – films, games, drone shots, vlogs, etc. Links…

188

105

26.0K

Jean de Nyandwi Retweeted

Nando de Freitas@NandoDF · Apr 24

x.com/i/article/1915…

277

260

39.0K

Jean de Nyandwi@Jeande_d · Apr 21

Fresh GPT‑o3 results on our vision‑centric #NaturalBench (NeurIPS’24) benchmark! 🎯 Its new visual chain‑of‑thought—by “zooming in” on details—cracks questions that still stump GPT‑4o. Yet vision reasoning isn’t solved: o3 can still hallucinate even after a full minute of…

ZZhiqiu Lin@ZhiqiuLin · Oct 21

🚀 Make Vision Matter in Visual-Question-Answering (VQA)! Introducing NaturalBench, a vision-centric VQA benchmark (NeurIPS'24) that challenges vision-language models with pairs of simple questions about natural imagery. 🌍📸 Here’s what we found after testing 53 models…

108

21.0K

Jean de Nyandwi Retweeted

Yueqi Song@yueqi_song · Apr 16

238

140

62.0K

Jean de Nyandwi Retweeted

Graham Neubig@gneubig · Apr 9

A big two days of agents starting tomorrow at CMU (and then two days of agent hackathon after that!) Registration is still open so if you're in or around Pittsburgh come one come all: cmu-agent-workshop.github.io We also plan to livestream for participants who can't make it in person

186

17.0K

Jean de Nyandwi@Jeande_d · Apr 2

People at Artificial Analysis are doing good job on comparing models across important axes: intelligence, speed, and price. And for all modalities and tasks: language, speech, video, image, code. Benchmarks are not perfect but there is no other way to know a model is better than…

AArtificial Analysis@ArtificialAnlys · Mar 31

OpenAI’s GPT-4o Image Generation debuts with an ELO score in equal first-place in the Artificial Analysis Image Arena, outperforming Recraft V3, FLUX 1.1 [pro] and Gemini 2.0 Flash @OpenAI last week launched GPT-4o Image Generation, upgrading ChatGPT’s built-in image generation…

1.0K

Jean de Nyandwi@Jeande_d · Mar 31

Today's a big day! Months of work went into both of these releases, so we hope people enjoy them. OpenHands is now a great coding agent that you can run entirely locally (w/ OpenHands LM), and a great coding agent that you can run anywhere (w/ OpenHands Cloud).

AAll Hands AI@allhands_ai · Mar 31

Today, we're excited to make two big announcements! - OpenHands LM: The strongest 32B coding agent model, resolving 37.4% of issues on SWE-bench Verified 📈 - OpenHands Cloud: SOTA open-source coding agents from your computer, phone, github, with $50 in free credits 🙌☁️

161

16.0K