Matei Zaharia
@matei_zaharia
CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, http://DSPy.ai. http://linkedin.com/in/mateizaharia
Excited to launch Agent Bricks, a new way to build auto-optimized agents on your tasks. Agent Bricks uniquely takes a *declarative* approach to agent development: you tell us what you want, and we auto-generate evals and optimize the agent. databricks.com/blog/introduci…
This is a good opportunity to announce that I recently joined the research team at @databricks, where I will be working alongside @jefrankle, @rishabhs, @matei_zaharia, Erich Elsen, and many others on the hardest problems at the intersection of information retrieval and AI.
I'm at ICML 🇨🇦 and I'm hiring at @databricks. Visit our booth if you're interested. My scientific focus: It's 1972 in AI, there's an AI crisis, Dijkstra isn't here to save us, and maybe RL can. Why Databricks? The long road to AGI is being paved here and we have the real evals 🧵
The SkyRL roadmap is live! Our focus is on building the easiest-to-use high-performance RL framework for agents. We'd love your ideas, feedback, or code to guide the project: github.com/NovaSky-AI/Sky…
Does RL actually learn positively under random rewards when optimizing Qwen on MATH? Is Qwen really so magical that even RL on random rewards makes it reason better? Following prior work on spurious rewards in RL, we ablated the algorithms. It turns out that if you…
Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find that this unexpected behavior hinges on the inclusion of certain *heuristics* in the RL algorithm. Our blog post: tinyurl.com/heuristics-con…
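A minimal sketch of one way this can happen, assuming GRPO-style group normalization (an assumption for illustration, not necessarily the exact algorithm the post ablates): within a group of rollouts, even coin-flip rewards produce nonzero normalized advantages, so the policy still receives a gradient whose direction is set by algorithmic heuristics (normalization, clipping) rather than by task correctness.

```python
def group_normalized_advantages(rewards, eps=1e-6):
    # GRPO-style advantage: (reward - group mean) / (group std + eps).
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# One draw of purely random (coin-flip) rewards for a rollout group.
rewards = [1.0, 0.0, 1.0, 0.0]
advs = group_normalized_advantages(rewards)
# The advantages come out roughly +1 / -1: nonzero signal even though
# the rewards themselves carry no information about the task.
```

The point of the sketch: the normalization step manufactures per-sample signal out of noise, and which samples get pushed up or down is then decided by the algorithm's heuristics.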
We're finding that what's needed in RL for enterprise tasks is quite different from what's needed in foundation model training on math, code, etc. Catch @jefrankle and our team at ICML to talk about these problems!
Properties of our problems:
* Semi-verifiability. Can LLM judges productively augment RLVR? How clean must they be?
* Intermediate rewards. Signals we can exploit to make harder tasks tractable.
* Real traces. Tons of human traces for imitation learning or environment building.
The #SIGIR2025 Best Paper was just awarded to the WARP engine for fast late interaction! Congrats to Luca Scheerer 🎉 WARP was his @ETH_en MS thesis, completed while visiting us at @StanfordNLP. Incidentally, it's the fifth Paper Award for a ColBERT paper since 2020!* Luca did an…
📢 If you’re at #SIGIR2025 this week, make sure to be at Luca Scheerer’s paper talk: “WARP: An Efficient Engine for Multi-Vector Retrieval” (Wednesday 11am) WARP makes PLAID, the famous ludicrously fast ColBERT engine, another 3x faster on CPUs. With the usual ColBERT quality!
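For readers new to multi-vector retrieval, the "late interaction" that WARP and PLAID serve efficiently is ColBERT's MaxSim scoring: each query token vector is matched to its most similar document token vector, and those maxima are summed. A toy sketch (real engines vectorize this and add pruning/compression):

```python
def late_interaction_score(q_vecs, d_vecs):
    # ColBERT-style MaxSim: for each query token embedding, take the max
    # dot product over all document token embeddings, then sum.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in d_vecs) for q in q_vecs)

q = [[1.0, 0.0], [0.0, 1.0]]              # two query token vectors
doc = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # three document token vectors
score = late_interaction_score(q, doc)     # 0.9 + 0.8 = 1.7
```

The engineering challenge WARP tackles is doing this interaction over millions of documents fast, which is where the 3x CPU speedup over PLAID comes in.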
Come find me at the poster session now at #SIGIR2025! Let’s chat about LLM-based relevance judgments! @SIGIRConf
Yes, this is a description of how the dspy.SIMBA optimizer works. > a review/reflect stage along the lines of "what went well? what didn't go so well? what should I try next time?" etc. and the lessons from this stage feel explicit, like a new string to be added to the system…
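A hypothetical sketch of that review/reflect loop (the names `reflect`, `optimize_prompt`, and the `llm` callable are illustrative, not dspy.SIMBA's actual API):

```python
def reflect(trace, reward, llm):
    # Ask an LLM: what went well, what didn't, what to try next time?
    prompt = (
        "Here is an agent trace and its reward.\n"
        f"Trace: {trace}\nReward: {reward}\n"
        "What went well? What didn't go so well? What should be tried "
        "next time? Reply with one short, actionable lesson."
    )
    return llm(prompt)

def optimize_prompt(system_prompt, episodes, llm):
    # Each lesson is an explicit new string appended to the system prompt,
    # mirroring the review/reflect stage described above.
    for trace, reward in episodes:
        system_prompt += "\n- Lesson: " + reflect(trace, reward, llm)
    return system_prompt

# Usage with a stubbed-out LLM:
stub_llm = lambda prompt: "cite sources before answering"
new_prompt = optimize_prompt("You are a helpful agent.",
                             [("trace-1", 0.2)], stub_llm)
```

The design point is that the lessons are explicit strings a human can read and audit, rather than opaque weight updates.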
Scaling up RL is all the rage right now; I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly…
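That truncated intuition, "this went well, so slightly raise its probability," is the REINFORCE update in a sentence. A toy sketch over a softmax policy (illustrative only, not any lab's training code):

```python
import math

def reinforce_update(logits, action, reward, baseline, lr=0.1):
    # Softmax policy over discrete actions; nudge the taken action's logit
    # up if the reward beat the baseline, down otherwise.
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    advantage = reward - baseline
    # Gradient of log pi(action) w.r.t. logit_i is (1[i == action] - probs[i]).
    return [
        l + lr * advantage * ((1.0 if i == action else 0.0) - probs[i])
        for i, l in enumerate(logits)
    ]

logits = [0.0, 0.0, 0.0]
new_logits = reinforce_update(logits, action=1, reward=1.0, baseline=0.0)
# Action 1's logit goes up; the others drift down slightly.
```

It only ever redistributes probability toward behaviors already sampled and rewarded, which is one reason to doubt it's the full story.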
Awesome read on Lucene's implementation of ACORN-1🔥🔥 Filtered vector search is everywhere! Efficient, general-purpose (predicate-agnostic) indices that can support those use cases are super, super powerful!! Try it out & check out our original paper dl.acm.org/doi/10.1145/36…
Elasticsearch / Lucene adopts ACORN-1, which expands the exploration of nodes to ensure enough candidates that meet the filter By @benwtrent elastic.co/search-labs/bl…
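A toy sketch of the ACORN-1 idea (not Lucene's implementation; names and the single-level graph are simplifications): traverse only nodes that pass the filter, but when a direct neighbor fails the predicate, look one extra hop through it, so the search isn't cut off by runs of filtered-out nodes.

```python
from heapq import heappush, heappop

def filtered_search(graph, vectors, dist, query, entry, matches, k):
    # Best-first search over a proximity graph; entry is assumed to pass
    # the filter. Only matching nodes join the frontier and results.
    visited = {entry}
    frontier = [(dist(query, vectors[entry]), entry)]
    results = []
    while frontier:
        d, node = heappop(frontier)
        results.append((d, node))
        candidates = []
        for nb in graph[node]:
            if matches(nb):
                candidates.append(nb)
            else:
                # ACORN-1-style expansion: route through the filtered-out
                # neighbor to its own neighbors that do pass the filter.
                candidates.extend(n2 for n2 in graph[nb] if matches(n2))
        for nb in candidates:
            if nb not in visited:
                visited.add(nb)
                heappush(frontier, (dist(query, vectors[nb]), nb))
    return sorted(results)[:k]

# Path graph 0-1-2-3-4 with a filter keeping only even nodes: every route
# to node 4 runs through filtered-out odd nodes, so without the expansion
# the search would stall at node 0.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
vectors = {i: float(i) for i in range(5)}
hits = filtered_search(graph, vectors, lambda q, v: abs(q - v),
                       4.0, 0, lambda n: n % 2 == 0, 2)
```

Real implementations bound the traversal with a beam width; the sketch omits that to keep the expansion step visible.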
Separation of storage and compute doesn't sacrifice database performance. You can have both elasticity AND performance: neon.com/blog/separatio…
🔎 SkyRL + Search-R1 Training a multi-turn search agent doesn’t have to be complicated. With SkyRL, reproducing the SearchR1 recipe at high training throughput is quick and easy! We wrote up a detailed guide to show you how: novasky-ai.notion.site/skyrl-searchr1 1/N 🧵
I'm very excited to share some new work arxiv.org/abs/2506.06488. This work started out in conversations with @thorn where we realized that shadow model MIAs couldn't be used to audit models for harmful content of children. See 🧵 for why, and our progress on solving this...
As AI agents near real-world use, how do we know what they can actually do? Reliable benchmarks are critical, but agentic benchmarks are broken! Example: WebArena marks "45+8 minutes" on a duration calculation task as correct (real answer: "63 minutes"). Other benchmarks…
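An illustrative sketch of that failure mode (not WebArena's actual checker; `fuzzy_grade` and `strict_grade` are hypothetical names): a lenient token-overlap grader can mark the unevaluated expression "45+8 minutes" correct, while a strict grader parses out a single number and compares it.

```python
import re

def fuzzy_grade(prediction, gold):
    # Lenient: pass if any gold token appears in the prediction. This is
    # the kind of check that lets wrong answers slip through.
    return any(tok in prediction for tok in gold.split())

def strict_grade(prediction, gold_minutes):
    # Strict: the prediction must contain exactly one number, and it must
    # equal the gold duration. "45+8 minutes" is not an evaluated answer.
    nums = re.findall(r"\d+", prediction)
    if len(nums) != 1:
        return False
    return int(nums[0]) == gold_minutes

fuzzy_grade("45+8 minutes", "63 minutes")   # passes on the "minutes" token
strict_grade("45+8 minutes", 63)            # rejected
strict_grade("63 minutes", 63)              # accepted
```

The broader point: when the grader is weaker than the task, benchmark scores measure the grader, not the agent.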
I will be at #ICML next week! It'll be great to catch up with friends, old and new. Happy to chat about our work on Data + AI at @DbrxMosaicAI. We're growing our team and have openings for researchers and engineers in areas such as document intelligence, knowledge assistant, data…
We just published a Databricks App template that shows how to:
- Deploy a LangGraph agent as a Databricks app with a chat UI
- Automatically monitor MLflow 3.0 traces on Databricks (including syncing to delta tables, with Unity Catalog governance of traces)
I've also embedded my…
If you:
- Are an early-stage startup
- Have raised up to $5M in venture funding
- Are using Postgres
Apply to our Startup Program and get up to $100k in credits: neon.com/startups
✨Release: We upgraded SkyRL into a highly-modular, performant RL framework for training LLMs. We prioritized modularity—easily prototype new algorithms, environments, and training logic with minimal overhead. 🧵👇 Blog: novasky-ai.notion.site/skyrl-v01 Code: github.com/NovaSky-AI/Sky…
1/N 📢 Introducing UCCL (Ultra & Unified CCL), an efficient collective communication library for ML training and inference, outperforming NCCL by up to 2.5x 🚀 Code: github.com/uccl-project/u… Blog: uccl-project.github.io/posts/about-uc… Results: AllReduce on 6 HGX across 2 racks over RoCE RDMA
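For context on what any CCL (UCCL, NCCL) computes in an AllReduce: every rank ends up with the elementwise sum of all ranks' vectors. A single-process toy of the classic ring algorithm (a sketch of the semantics, not UCCL's implementation): the vector is split into n chunks so each link carries ~1/n of the data per step, which is why the ring layout is bandwidth-efficient.

```python
def ring_allreduce(rank_vectors):
    # Simulate a ring AllReduce over n "ranks" in one process.
    n = len(rank_vectors)
    assert n > 1 and all(len(v) % n == 0 for v in rank_vectors)
    m = len(rank_vectors[0]) // n
    # data[r][c] = rank r's copy of chunk c.
    data = [[v[c * m:(c + 1) * m] for c in range(n)] for v in rank_vectors]

    def accumulate(dst, src):
        for i in range(len(dst)):
            dst[i] += src[i]

    # Reduce-scatter: after n-1 steps, rank r fully owns reduced chunk (r+1) % n.
    for step in range(n - 1):
        for r in range(n):
            c = (r - step) % n
            accumulate(data[(r + 1) % n][c], data[r][c])

    # All-gather: circulate each fully reduced chunk around the ring.
    for step in range(n - 1):
        for r in range(n):
            c = (r + 1 - step) % n
            data[(r + 1) % n][c] = list(data[r][c])

    return [sum(data[r], []) for r in range(n)]

out = ring_allreduce([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])
# Each rank ends with the elementwise sum [6.0, 8.0, 10.0, 12.0].
```

Libraries like UCCL compete on how fast they move these chunks over real fabrics (RoCE RDMA, multi-rack topologies), not on the math itself.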
This is a pretty good article on how we are rethinking OLTP databases with Lakebase!
Are OLTP databases due for a radical rethink? @AlexWilliams reports from the @Databricks Data + AI Summit, where @rxin made the case for decoupling compute and storage in Postgres — treating data more like code. thenewstack.io/new-oltp-postg…