Wayde Gilliam

@waydegilliam

Helping customers look at their data @braintrustdata

San Diego, CA

Joined June 2016

207 Following

2K Followers

Pinned

Wayde Gilliam@waydegilliam · Jul 24

This is 🔥

Ankur Goyal@ankrgyl · Jul 24

I've known @brettberson for over a decade, so this was a really fun and candid conversation about Braintrust and some of my wacky opinions about how to build a product. I think he's a world class interviewer and feel honored to be part of this series. Take a listen :)

Wayde Gilliam Retweeted

Braintrust@braintrustdata · Jul 18

We've learned critical lessons from helping teams ship reliable LLM-powered products. Organizations using Braintrust run over 3,000 evaluations daily, providing us with unique insights into what actually works. Read more on the blog: braintrust.dev/blog/five-less…

Wayde Gilliam Retweeted

Hamel Husain@HamelHusain · 22 h

Reorganized the evals FAQ into categories, since there are so many now! You can also download the FAQ in different formats (pdf, markdown) from the sidebar on the page directly. hamel.dev/blog/posts/eva…

Wayde Gilliam Retweeted

Hamel Husain@HamelHusain · Jul 26

If you want ONE place to keep up with AI coding agents, you should pay attention to what @isaac_flath and @intellectronica are putting together: bit.ly/coding-ai. I've worked with both, and they have phenomenal taste. Isaac: 7+ years working on dev tools in both open…

Wayde Gilliam@waydegilliam · Jul 25

Qwen released their updated "thinking" model today. It thinks really hard! Took 166 seconds to think through the details of drawing me a pelican on a bicycle. The finished drawing wasn't great but the thoughts behind it were fun to see. simonwillison.net/2025/Jul/25/qw…

Qwen@Alibaba_Qwen · Jul 25

🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding…

Wayde Gilliam@waydegilliam · Jul 24

evals are all you need

Tanishq Abraham back from ICML@iScienceLuvr · Jul 24

Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains 'We introduce Rubrics as Rewards (RaR), a framework that uses structured, checklist-style rubrics as interpretable reward signals for on-policy training with GRPO. Our best RaR method yields up to a relative…

Wayde Gilliam Retweeted

Ankur Goyal@ankrgyl · Jul 24

something i've been thinking about recently
there are no more engineers, designers, PMs, etc
there are product owners. product owners write code, solicit feedback, drive roadmap, collaborate, talk to customers, answer support tickets, etc.

Wayde Gilliam Retweeted

Simon Willison@simonw · Jul 24

GitHub released Spark yesterday, their extremely well crafted prompt-to-app platform for creating and iterating on React apps with user auth and persistent storage
I like it a lot!
I reverse engineered it with Spark itself, the details are fascinating
simonwillison.net/2025/Jul/24/gi…

Wayde Gilliam@waydegilliam · Jul 23

Thanks to our new faststripe lib, it's now this easy to integrate @stripe into your python app:

Nathan Cooper@ncooper57 · Jul 23

At @answerdotai, we integrate @stripe into lots of projects. Every time, I found myself doing the same dance: create product, create price, create checkout session. Then hunting docs for parameters for each. So we built FastStripe, a self-documenting Stripe SDK that's easy to…

Wayde Gilliam@waydegilliam · Jul 23

I suspect this is largely because asynchronous programming is a new skill. If you're not good at it, then you won't gain productivity.

Josh Constine 📶🔥@JoshConstine · Jul 23

Developers using AI coding copilots complete 98% more code changes and 21% more tasks... ...but their companies don't ship faster due to downstream bottlenecks and a 91% increase in code review time Any journalists want to see this study of 10K developers before it publishes?

Wayde Gilliam Retweeted

Prajwal Tomar@PrajwalTomar_ · Jul 22

Cursor Pro Tip: Always end your prompts with: “Explain the full approach you’d take to implement this. Just tell, don’t code.” Cursor will map out its entire plan. Review it, tweak if needed, then let it execute. It makes a HUGE difference in how well Cursor executes your…

Wayde Gilliam Retweeted

Ankur Goyal@ankrgyl · Jul 22

things i install when i ssh:
* tmux
* neovim
* claude
* uv
* gh
* htop
* dstat

Wayde Gilliam@waydegilliam · Jul 22

This looks sick!

Isaac Flath@isaac_flath · Jul 22

I'm launching Context Engineering For Coding to make AI assisted coding more efficient, and it works with Cursor, Claude Code, Copilot, all of them. Here's a 30% off discount link for early enrollers maven.com/kentro/context…

Wayde Gilliam@waydegilliam · Jul 21

ngl I'm most excited about this cage match between Eval vendors. They are going to solve the homework assignments, side-by-side. @hwchase17 (Langsmith) vs @mikeldking (Phoenix) vs @waydegilliam (Braintrust) maven.com/parlance-labs/…

Shreya Shankar@sh_reya · Jul 21

Excited to kick off a much improved version of our AI evals course tomorrow (link in replies). 💫 We've added dedicated homework sessions, an updated course reader & lectures that incorporate 100s of questions from cohort 1. There’s more hands-on/live error analysis, plus…

Wayde Gilliam Retweeted

Aakash Gupta@aakashg0 · Jul 19

"Vibe checks" are great—until you need to scale. In this clip, @HamelHusain and @sh_reya break down why relying on human intuition isn’t enough when it comes to evaluating product or model quality at scale. Instead, they explain how to codify those gut checks into scalable,…

Wayde Gilliam Retweeted

Anthropic@AnthropicAI · Jul 17

We've launched Claude for Financial Services. Claude now integrates with leading data platforms and industry providers for real-time access to comprehensive financial information, verified across internal and industry sources.

Wayde Gilliam Retweeted

Hamel Husain@HamelHusain · Jul 18

Just spotted this for the first time "Simply paste the URL of this blog post into Claude and tell it to set it up for you." Blog posts written for the computer. Amazing. steipete.me/posts/command-…

Wayde Gilliam@waydegilliam · Jul 17

Here’s a teaser from my opening lecture next week. Join us if you too are interested in seeing me shape shift into a fiery brain as I intro @braintrustdata 🧠+📈 maven.com/parlance-labs/…

Wayde Gilliam Retweeted

Ankur Goyal@ankrgyl · Jul 17

people seem to really love Loop. every day, people ask, "how do I build an agent like this?" the answer is simple :) use @braintrustdata

Wayde Gilliam@waydegilliam · Jul 17

LFG!!! 🧠+🤝

Hamel Husain@HamelHusain · Jul 17

The eval space is the most intense battle for AI market share I have seen second to coding agents. This is why we will have Arize & Braintrust go head-to-head. They will each show how to complete our 5 homework assignments using their tools. Over 1k students learning about…
