Cartesia
@cartesia_ai
The fastest, ultra-realistic voice AI platform. http://discord.gg/cartesia
We've raised a $64M Series A led by @kleinerperkins to build the platform for real-time voice AI. We'll use this funding to expand our team, and to build the next generation of models, infrastructure, and products for voice, starting with Sonic 2.0, available today. Link below…
👑 We’re #1! Sonic-2 leads @Labelbox’s Speech Generation Leaderboard, ranking first in speech quality, word error rate, and naturalness. Build your real-time voice apps with the 🥇 best voice AI model. ➡️ labelbox.com/leaderboards/s…
We’re thrilled to partner with @regal_ai, the AI agent platform for enterprise CX! Our ultra-realistic, low-latency voices are now available to use with Regal’s enterprise-grade voice AI agents. This collaboration will enable Regal to deliver more natural and responsive customer…

Voice is the new UX frontier, and that includes the kitchen 🧑‍🍳 Caught mid-fish fry and need to check the next steps? 🐟 See our STT power this hands-free demo from our friends at @CerebrasSystems!
do you ever have a fish to fry but your hands are too dirty? our new intern @imbaime built a crazy real-time voice to browser automation and this was his demo... the stack 🤖 > @CerebrasSystems for fast inference > @browserbasehq for browser automation > @Cartesia for voice…
📢 ICML friends — don’t miss this! Albert Gu, June Hwang, and Brandon Wang will be hosting a meet & greet at the Cartesia booth to chat about their recent H-Net Paper: arxiv.org/abs/2507.07955. 🗓️ Thursday, July 17 🕛 12:00 PM 📍 Cartesia AI Booth | ICML Expo Floor Come by to…

Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence. Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
🚨 Cartesia is heading to ICML! 🚨 We’ll be on the exhibitor floor all week, so come say hi! 👋 Check out what we're building in voice, meet the team, and geek out with us on the future of AI architectures. Whether you’re a researcher, engineer, or just…

Check out this blog from our co-founder @_albertgu on research in architectures and tokenization. Lots of new things coming...
I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit) In a few days, we'll release what I believe is the next major advance for architectures.
Ab live hai: India ke liye Sonic TTS, Hinglish mein 🇮🇳 | Now live: Sonic TTS in Hinglish for India 🇮🇳 ♒ Fluid transitions between English and Hindi ⚡ Lowest-latency TTS with deployments in India 👋 Local, authentic, friendly voices Listen here: cartesia.ai/india
📅 Join our co-founder @jundesai at GenAI Model Training & Inference Innovators Night hosted by @AWSAI in SF! Hear the latest innovation from our team and listen to our panel on building, scaling, and bringing AI models to market. Calling founders, builders, and engineers…

Thanks to @SapphireVC for hosting our co-founder @bclyang at the Hypergrowth Engineering Summit. He shared that voice is the next UX frontier and building voice agents requires excellence at every layer. Great to see so many technical leaders leaned in! #SapphireHypergrowthEng…

Introducing @cartesia_ai Ink-Whisper on Vapi! Slow STT ruins conversations. With Ink-Whisper, just one line in your Vapi config gets you: • Faster, streaming-optimized transcripts • Better handling of background noise and accents Start using it on Vapi today.
Talk to Cartesia speech-to-text about Cartesia speech-to-text. Cartesia launched a streaming STT model today, called Ink-Whisper, that's optimized for realtime voice AI. @pipecat_ai has launch-day support for this new model, so I figured I'd talk to the model about itself.…
Building voice agents? Meet Ink-Whisper: the fastest, most affordable streaming speech-to-text model. 🌎 Optimized for accuracy in real-world conditions 👯 Pair with our Sonic text-to-speech → fastest duo in voice AI 🔌 Plugs into @Vapi_AI, @pipecat_ai, @livekit Read more:…

Headed to @CustContactWeek in Vegas next week? Come find us on the floor! We’re building the next generation of real-time Voice AI–faster, more flexible, and ready for the enterprise–and we’d love to meet you. Swing by our booth to see what we’ve been working on, chat with two of…

We’re building for developers scaling voice AI. Introducing two features to make collaboration and visibility easier: 🤝 Organizations and 📊 Dashboards. 🤝 The Organizations feature gives teams shared access to API keys, custom voices, and billing–all under one account.…
