Daily
@trydaily
Build human and AI ultra low latency conversations. We maintain Pipecat with contributions from the developer community. https://daily.co https://pipecat.ai
I love this demo from @thorwebdev, and the esp32 hardware hacking project that he links to in the thread. You can run "serverless" WebRTC for ultra low-latency audio, with no external dependencies or infrastructure required, on everything from tiny microcontrollers like this all…
Is this the tiniest little voice agent yet?! My @elevenlabsio voice clone running on an esp32 microcontroller via @pipecat_ai and WebRTC! 🔥 Story time: I recently caught up with Danilo Campos who is building the awesome DeskHog (seriously, check it out!) at @posthog and he…
So cool! A game of Guess Who with Pipecat on an ESP32 👇
We played a game of Guess Who? using @Speechmatics diarization that knows who’s talking, running on a tiny ESP32 using WebRTC via @pipecat_ai from @trydaily Yes. Really. 😎 Matt Barty and I went up against “Humphrey”, trying to guess a mystery Brit ... 🎩 With diarization,…
How to go from ideas to real-time. We’re thrilled to welcome @kwindla, CEO & Co-Founder at @trydaily, to the VapiCon stage!
SF Builders: join us to build real-time voice agents! Tackle real-time speech synthesis, AI orchestration, production deployment, and more. Leave with working code + tips. Only a few spots left! 👉 RSVP: lu.ma/genailoft-deep… 📆 July 28 | 5-7 PM 📍 AWS GenAI Loft in SF…
You don't need a WebRTC server for voice agents. If you're deploying your own voice AI infrastructure, you should almost certainly be using the new(†) serverless WebRTC approach. Serverless is much simpler, which translates to faster development, better scaling, and higher…
Smart Turn v2: open source, native audio turn detection in 14 languages. New checkpoint of the open source, open data, open training code, semantic VAD model on @huggingface, @FAL, and @pipecat_ai. - 3x faster inference (12ms on an L40) - 14 languages (13 more than v1, which…
Hot takes (from me) and reasoned discussion (from Sam) on: The current best practices for building production, enterprise scale, voice agents. What models to use, how to think about infrastructure, and what the solved and unsolved problems in voice AI are, right now. Why I…
In this episode, @kwindla Kramer, co-founder and CEO of @trydaily and creator of the open source @pipecat_ai framework, joins us to discuss the architecture and challenges of building real-time, production-ready conversational voice AI. Kwin breaks down the full stack for voice…
If you've been wanting to get started with voice AI, this is a great tutorial. At the end of this short video you'll have a production-ready, auto-scaling, customizable voice agent deployed to the cloud. The agent uses: - @AssemblyAI's fast, accurate, streaming transcription…
Voice AI hardware + WebRTC + Pipecat Here's a @pipecat_ai SmallWebRTCTransport client for the ESP32-S3 family of embedded devices. The SmallWebRTCTransport is a serverless WebRTC connection designed for voice AI. Link to code and resources below ... @aconchillo wrote the ESP32…
Pipecat Flows is a context engineering library for voice agents. Today's LLMs are *great* at two things: natural, open-ended conversation; and extracting structured data from unstructured input. They are not great (yet) at reliably following detailed instructions throughout a…
Two big voice agent releases today: Pipecat 0.0.72 and voice-ui-kit 0.0.1. The Pipecat release includes new features for building and debugging complicated voice agents. Watchdog timers give you the ability to set per-processor notifications so you can track down slow or…
Real pain points emerge when connecting to an existing on-prem PBX stack: tooling and so many nitty-gritty things to take care of. Thanks @kwindla and the @pipecat_ai team for already going down that path and presenting us with an intuitive platform.
Nice PR from @YousifAstar adding streamable_http support to Pipecat's `MCPClient`! github.com/pipecat-ai/pip… If you're doing MCP-related things with voice AI, I'd love to hear about both what's working well for you and what issues you're hitting.
Live now! If you're building voice AI with Gemini, join office hours with @shresbm, Group PM for the Gemini APIs
.@shresbm and I gave a talk at @aiDotEngineer World's Fair about building real-world voice agents that leverage the most advanced features of today's models, APIs, and frameworks. The themes of the talk were: - What the moving parts involved in building production voice agents…
Imagine any patient being able to talk to any provider, and get the support they need to access care. This Vocality demo showcases how an AI Medical Interpreter handles noisy, real-world settings — like TVs blaring in the background! @pipecat_ai @heytavus
Most voice AI agents fall apart in noisy environments, especially with cross talk, music, or a TV blaring. At Vocality, we've spent the last 6 months making sure ours doesn't. Here’s a live demo of our medical interpreter and a short thread on what worked, didn't work, and…
Talk to Cartesia speech-to-text about Cartesia speech-to-text. Cartesia launched a streaming STT model today, called Ink-Whisper, that's optimized for realtime voice AI. @pipecat_ai has launch-day support for this new model, so I figured I'd talk to the model about itself.…