Stephen Bach
@stevebach
Asst. prof. @BrownCSDept. Working on improving how humans teach computers. Weak supervision, zero-shot learning, few-shot learning, and high-level knowledge.
What I look for when hiring? EXTREME PARANOIA about code and data
Not all benchmarks are created equal. We built a PhD-level multiple-choice test across 1,000+ subdomains, STEM, humanities, pro fields. Top LLMs? Scored <20%. This is what it takes to test advanced reasoning. Built with Snorkel’s Expert Data-as-a-Service. #LLM #GenAI
An underappreciated fact about using formal methods like Lean is that it enables large-scale *collaboration* among mathematicians & potentially future AI agents. Why? Well, you can decompose a large proof into separate components that can be proven independently with robust…
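A minimal Lean 4 sketch of the decomposition idea in the post above. The theorem names `part_one`, `part_two`, and `main` are hypothetical and chosen only for illustration. The point is that `main` depends only on the statements of the two lemmas, so each lemma could in principle be proven by a different contributor.

```lean
-- Component 1: could be proven by one contributor.
theorem part_one (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Component 2: could be proven independently by another contributor.
theorem part_two (a : Nat) : a + 0 = a :=
  Nat.add_zero a

-- The main result uses only the lemma statements, never their proofs.
theorem main (a b : Nat) : (a + 0) + b = b + a := by
  rw [part_two a, part_one a b]
```

Because Lean checks each component against its stated type, replacing the proof of `part_one` with a different valid proof would leave `main` unchanged.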
Thanks @lateinteraction ! Every time I think about the gazillion prompt / systems engineering tweaks that also go into making an AI system work I think about how early you were with @DSPyOSS :) Shared theme: find the key human input and make it programmatic.
Every time I think about what it takes to systematically organize the gazillion training tasks that together make a great foundation model, my appreciation for how early @SnorkelAI was increases.
America’s innovative edge makes us great. Tell Congress: ProtectScienceAndInnovation.org Check out (and help push!) this nonpartisan campaign for investing in our most critical national edge! #ProtectScience #InnovationMakesAmericaGreat
@diana_freed has received a CRA Trustworthy AI Research Fellowship, supporting early-career computing researchers who bring interdisciplinary expertise from the social sciences to infuse ethical and societal perspectives into Trustworthy AI development: cs.brown.edu/news/2025/07/0…
DSI is thrilled to welcome our new Director, Dr. Brenda Rubenstein, Vernon K. Krieble Professor of Chemistry and Professor of Physics. Rubenstein will lead DSI for the next three years. Read about Brenda's vision for DSI: dsi.brown.edu/news/2025-07-0…
Incoming DSI Director Brenda Rubenstein addressed members of Congress this week 👏 Read more on Brown News 👇
Brown University scientist Brenda Rubenstein urged members of Congress to support the American economy and national security by investing in quantum science research at the nation’s universities during a Tuesday, June 24, hearing. brown.edu/news/2025-06-2…
Scale alone is not enough for AI data. Quality and complexity are equally critical. Excited to support all of these for LLM developers with @SnorkelAI Data-as-a-Service, and to share our new leaderboard! — Our decade-plus of research and work in AI data has a simple point:…
We're happy to announce that effective as of July 1, 2025, faculty members @stevebach and @drsrinathsridha have received named chairs. Steve is now the Eliot Horowitz Assistant Professor in CS and Srinath is the John E. Savage Assistant Professor in CS: cs.brown.edu/news/2025/06/0…
New KDD 2025 paper: Can large language models (LLMs) reason like biomedical scientists? We introduce K-Paths, a retrieval framework for extracting reasoning paths from knowledge graphs (KGs) to aid drug discovery tasks. 👇 Thread:
Great to work on this benchmark with astronomers in our NSF-Simons CosmicAI institute! What I like about it: (1) focus on data processing & visualization, a "bite-sized" AI4Sci task (not automating all of research) (2) eval with VLM-as-a-judge (possible with strong, modern VLMs)
How good are LLMs at 🔭 scientific computing and visualization 🔭? AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results. SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵