mok

@mokshith_v

ai x video @sievedata

San Francisco, CA

Joined April 2018

1K Following

1K Followers

Pinned

mok Retweeted

Sieve@sievedata · Jun 19

Multi-modal models like GPT-4o are starting to outperform traditional models on core video/audio understanding tasks. One example is diarization in AI dubbing. Most tools break on multi-speaker videos. We're shipping a fix for this + major upgrades next week. Here's a preview.

628

Pinned

mok@mokshith_v · Jun 17

ahi's background is unique - from building the world's smallest batteries to training high quality TTS and diffusion models in the Computational Image Group at Rice. excited to be working with him. if you want to work on internet-scale computer vision problems as well, DM me :)

Ahi@ahitagnied · Jun 17

just completed week 1 at @sievedata & the energy is unreal. i’m shipping next-gen video understanding apis from whiteboard to prod, on the applied ml team. huge thanks to @mokshith_v & @thpicy for bringing me onboard, & to @jcbmrshll for the mentorship. excited for what’s next

318

Pinned

mok@mokshith_v · Jun 14

now who's gonna build the sonnet 3.5 equivalent in video models? something that doesn't top the leaderboards but wins on taste, vibes, and true utility.

Justine Moore@venturetwins · Jun 14

There's a new AI video model topping the leaderboard: Seedance. It's #1 in the @ArtificialAnlys ranking for text-to-video and image-to-video. And after voting on 3k+ clips across dozens of models, I found the hype holds up. 5 things it's particularly good at (with examples) 👇

278

mok@mokshith_v · Jul 22

your ability to convince great people to work with you = your ability to clearly communicate what the fuck you actually do

161

mok@mokshith_v · Jul 21

so tl;dr - internet video is still useful but too many researchers are treating it as a complete replacement to the real thing and will soon learn a very bitter lesson because of it 🫡

Sergey Levine@svlevine · Jul 21

I wrote a fun little article about all the ways to dodge the need for real-world robot data. I think it has a cute title. sergeylevine.substack.com/p/sporks-of-agi

317

mok Retweeted

Abhinav Ayalur@thpicy · Jul 18

If you're a cracked infra engineer who:
- loves Go, ETL, OLAP DBs, k8s
- wants to change the future of video AI
- loves seeing their work translate to $$$
we're aggressively hiring right now @sievedata. Dm me. 💜🚀

626

mok@mokshith_v · Jul 17

Sieve is growing fast and needs engineers!

mok@mokshith_v · Jul 17

The last two months have been insane. I randomly came into the office this morning and decided to record this video on why there has literally never been a more exciting time to join @sievedata. We're working with leading research teams pushing the frontier of creative, robotics,…

4.0K

mok@mokshith_v · Jul 17

best part of the first 20 minutes of my day is selecting my desired caffeine intake mechanism. alternating each day is good for the mind

206

mok@mokshith_v · Jul 17

i am so excited 💜

367

mok Retweeted

Avinash@avinashj_ · Jul 16

Just shipped Orbit.
- Overlay running route & stats on photos
- Track your PRs across any timeframe
- Browse your run history like a commit log
The app you use before you inevitably upload to Strava & Instagram. Details in 🧵

2.0K

mok@mokshith_v · Jul 11

did you know ByteDance (parent company of TikTok) has employed a Head of Robotics Research for the last 6 years? wonder what he was up to.

268

mok@mokshith_v · Jul 2

more evals for hard to evaluate things 💜

Sieve@sievedata · Jul 2

We benchmarked 14 of the top AI dubbing tools, and the results shocked us. Even some of the popular names failed at preserving speaker identity or handling multi-speaker videos. For context, the evals were conducted by third-party native speakers across 8 languages.

371

mok@mokshith_v · Jun 24

i'm excited to finally share this with the world. foundation model improvements continue to allow our research team to make improvements up the stack and see material differences in output dubs. later this week we'll be sharing some updated, de-anonymized evals as well 🌎💜

Sieve@sievedata · Jun 24

Introducing Dubbing 3.0 - the highest quality AI video translator.
- Handles multi-speaker video better than any provider
- Expresses emotions better (e.g. calm vs frustrated)
- More natural, context-aware translations
- Supports 30+ languages and accents

411

mok@mokshith_v · Jun 19

i am so excited for the role datasets like these will play in the future of robotics and gaming. unfortunate that this dataset in particular is only 720p, and not strictly curated towards more "interesting" scenarios, but still great work by the authors!

Aran Komatsuzaki@arankomatsuzaki · Jun 19

Sekai: A Video Dataset towards World Exploration
A high-quality 5k hrs of egocentric worldwide video + audio dataset for world exploration, created from YouTube with high-quality annotations

468

mok@mokshith_v · Jun 18

video-native MJ aesthetics 😍 congrats to the entire team!

Midjourney@midjourney · Jun 18

Introducing our V1 Video Model. It's fun, easy, and beautiful. Available at 10$/month, it's the first video model for *everyone* and it's available now.

231

mok@mokshith_v · Jun 16

i was today days old when i realized i need to up my slack emoji game

772