kwindla
@kwindla
Infrastructure and developer tools for real-time voice, video, and AI. @trydaily // ᓚᘏᗢ // @pipecat_ai
Introducing Pipecat Cloud, infrastructure for open source voice AI agents. If you're building voice AI agents with @pipecat_ai, you have lots of options for hosting your agents: anywhere you can run a Python process and terminate WebSocket or WebRTC connections. But managing…


The team at @Speechmatics just shipped a really clean integration of realtime speaker diarization for voice agents. I've tinkered quite a bit with multi-speaker voice agent pipelines, and this is the best implementation I've seen so far. Voice AI in 2025 is at a really…
We played a game of Guess Who? using @Speechmatics diarization that knows who’s talking, running on a tiny ESP32 using WebRTC via @pipecat_ai from @trydaily. Yes. Really. 😎 Matt Barty and I went up against “Humphrey”, trying to guess a mystery Brit ... 🎩 With diarization,…
It's great to see the traction @KaranVaidya6 and team have been getting. Going back to the early days of the current AI era (you know, two years ago), Karan's work was always super impressive to me, and I learn a lot every time we talk.
Agents aren’t reliable. They don’t learn from experience. At @composiohq, we provide skills that evolve with your agents. @lightspeedvp gave us $25M to make agents usable.
🤩This is a great opportunity to talk to an expert about your evals hopes and dreams.
Doing eval office hours this week in SF, few slots left, dm if you’re interested
> Any licensing restrictions for commercial use for the entire pipeline? The smart-turn model is completely open. No restrictions. (BSD license, just to avoid any ambiguity.) Use it, modify it, fork it. Contribute back if you want to. Don't if you don't. The pipeline in the…
Any licensing restrictions for commercial use for the entire pipeline?
Hot takes (from me) and reasoned discussion (from Sam) on: The current best practices for building production, enterprise-scale voice agents. What models to use, how to think about infrastructure, and what the solved and unsolved problems in voice AI are, right now. Why I…
In this episode, @kwindla Kramer, co-founder and CEO of @trydaily and creator of the open source @pipecat_ai framework, joins us to discuss the architecture and challenges of building real-time, production-ready conversational voice AI. Kwin breaks down the full stack for voice…
I love this demo from @thorwebdev, and the esp32 hardware hacking project that he links to in the thread. You can run "serverless" WebRTC for ultra low-latency audio, with no external dependencies or infrastructure required, on everything from tiny microcontrollers like this all…
Is this the tiniest little voice agent yet?! My @elevenlabsio voice clone running on an esp32 microcontroller via @pipecat_ai and WebRTC! 🔥 Story time: I recently caught up with Danilo Campos who is building the awesome DeskHog (seriously, check it out!) at @posthog and he…
> So maybe perfect turn detection shouldn’t be the goal. I've had several conversations lately about a general version of this idea: don't define success for your AI system relative to a hypothetical "perfect" performance; instead benchmark against the strengths, weaknesses, and…
Humans indeed do this all the time. And we get it wrong and either talk over each other or apologize and cede. I saw some stat that in group conversation we talk over each other or laugh / make phatic sounds ~15% of the time. So maybe perfect turn detection shouldn’t be the…
AGI has been achieved. After many months of hearing the same jokes from *all* the LLMs, GPT-4.1 just told me a new one. And I laughed out loud. And I heard it on my @MentraLabs glasses. (It's a hackathon weekend!) Here's the trace ...
