argmax
@argmaxinc
Foundation Models On Device
Introducing SpeakerKit
State-of-the-art on-device speaker diarization:
- 10 minutes of audio processed in 3 seconds
- 10 megabytes in total
- 6-year-old devices supported
Details and links to the demo app are in the thread.

Argmax Pro SDK 1.6.1 Changelog:
- Truncated words from Parakeet are fixed
- Empty transcripts for very short audio are fixed
- `VoiceActivityDetector.modelVAD` is now compatible with Parakeet models
Thanks for all the reproducible reports from developers as well as end-users!…
Major update to Argmax Pro SDK dropped today!
- Real-time transcription API now supports multiple concurrent sessions
- Diarized transcriptions have 40% lower error rates
- New high-level APIs to streamline developer experience
These improvements come at no additional latency or…
@superwhisperapp's local speech-to-text models have ~150 ms latency with the @argmaxinc backend but cloud LLM models for post-processing were adding huge latency on top. Thanks for reducing it @rauchg!
It's been fun collaborating with @superwhisperapp, a blazing fast AI speech-to-text app, with support for local and cloud models. By putting @vercel CDN in front of their model API, they're seeing up to 350ms gains in some geos 🤯 To be clear, this is 350ms+ faster by *just*…
It is raining speech models today!! @NVIDIAAI also quietly reclaimed the lead on the OpenASR leaderboard:
Introducing the world's best (and open) speech recognition models!
Going to @icmlconf and want to learn about the frontiers of on-device AI? Catch my talk on July 18 after @DAlistarh and before @songhan_mit! I will discuss the secret sauce and benchmarks behind how @argmaxinc built WhisperKit to outmatch several top cloud providers for…