Tengyu Ma
@tengyuma
Assistant professor at Stanford; Co-founder of Voyage AI (http://voyageai.com) ; Working on ML, DL, RL, LLMs, and their theory.
📢 Meet voyage-3.5 and voyage-3.5-lite! • flexible dim. and quantizations • voyage-3.5 & 3.5-lite (int8, 2048 dim.) are 8% & 6% more accurate than OpenAI-v3-large, and 2.2x & 6.5x cheaper, resp. Also 83% less vectorDB cost! • 3.5-lite ~ Cohere-v4 in quality, but 83% cheaper.
Stanford AI Lab (SAIL) is excited to announce new SAIL Postdoctoral Fellowships! We are looking for outstanding candidates excited to advance the frontiers of AI with our professors and vibrant community. Applications received by April 30 will receive full…
1/ Congrats @dittycheria @tengyuma @sahirazam + the @VoyageAI team on their acquisition by @MongoDB today! As a founding investor, @conviction believed that better embedding and re-ranking models were critical to robust AI-powered search/retrieval (a key enterprise AI use case)
The risk of hallucinations currently holds enterprises back from deploying AI apps. Excited to share that VoyageAI has joined MongoDB to make high-quality AI-powered search and retrieval easy, enabling organizations to build trustworthy AI apps at scale. mongodb.com/blog/post/rede…
Resurfacing this about neural retrieval, training embeddings models, and the state of RAG from @VoyageAI founder @tengyuma on @NoPriorsPod: youtu.be/MYsx9POL_x8?fe…
We joined @MongoDB! @VoyageAI’s best-in-class embedding models and rerankers will be part of MongoDB’s best-in-class database, powering mission-critical AI applications with high-quality semantic retrieval capability. A huge thank you to everyone with us on this journey, and to…

It's tough when one brain has to handle two "PR"s 😇😇: public relations and pull requests. I feel like I'm running an MoE; every time I see "PR", my visual cortex does a quick routing to the right part of my brain.
and SoTA among whole-proof generation methods on miniF2F, ProofNet, and PutnamBench, doubling the previous best results on LeanWorkBook. (Reposting because this table seems to get many more views 😝)
RL + CoT works great for DeepSeek-R1 & o1, but: 1️⃣ Linear-in-log scaling in train & test-time compute 2️⃣ Likely bounded by difficulty of training problems Meet STP—a self-play algorithm that conjectures & proves indefinitely, scaling better! 🧠⚡🧵🧵 arxiv.org/abs/2502.00212
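A toy calculation of what the "linear-in-log" scaling claim implies: if performance grows linearly in log(compute), each fixed gain in pass rate multiplies the compute bill by a constant factor. The intercept and slope below are made-up illustrative numbers, not values from the paper.

```python
import math

# Assumed illustrative scaling law: pass_rate = a + b * ln(compute).
a, b = 0.10, 0.05  # hypothetical intercept and slope

def pass_rate(compute):
    """Pass rate predicted by the assumed linear-in-log scaling law."""
    return a + b * math.log(compute)

def compute_needed(target):
    """Invert the law: target = a + b*ln(C)  =>  C = exp((target - a) / b)."""
    return math.exp((target - a) / b)

# Under these numbers, going from 50% to 60% pass rate costs e^2 ≈ 7.4x more compute.
print(compute_needed(0.60) / compute_needed(0.50))
```

This exponential cost per fixed improvement is the bottleneck the tweet alludes to, and why a self-play method with better scaling behavior is appealing.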
Thanks so much for the report @spyced. We are serious about evaluation and try hard to provide the best quality for REAL-WORLD use cases.
I ran a fresh evaluation of embedding models tuned for semantic retrieval, including the newest models from Voyage, Jina, Cohere, and NVIDIA. Link in thread.
Proud to share our best model yet, pushing boundaries again and outperforming all models on all domains (except voyage-code-3 on code). Our binary, 1024-dim embeddings are 5.53% better than OpenAI's float, 3072-dim ones. If you spent $10k monthly on storage, now it's $104 with us!
📢 Announcing the new SOTA voyage-3-large embedding model! • 9.74% over OpenAI and +20.71% over Cohere • flexible dim. (256-2048) and quantizations (float, int8, binary) • 8.56% over OpenAI with 1/24x storage cost • 1.16% over OpenAI with 1/192x storage cost ($10K → $52)
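A back-of-envelope check of the quoted storage savings, assuming vector DB cost scales linearly with bytes stored. The exact dimension/quantization pairs behind the 1/24x and 1/192x figures aren't stated in the tweet; 512-dim int8 and 512-dim binary happen to reproduce them against a 3072-dim float baseline (the size of OpenAI-v3-large embeddings).

```python
# Bits per element for each supported quantization.
BITS = {"float": 32, "int8": 8, "binary": 1}

def bytes_per_vector(dim, dtype):
    """Storage footprint of one embedding vector, in bytes."""
    return dim * BITS[dtype] / 8

baseline = bytes_per_vector(3072, "float")          # 12288 bytes per vector
print(baseline / bytes_per_vector(512, "int8"))     # 24.0  -> 1/24x storage cost
print(baseline / bytes_per_vector(512, "binary"))   # 192.0 -> 1/192x: $10K -> ~$52
```

Since storage bills are typically proportional to total bytes, a 192x smaller footprint turns a $10,000/month bill into roughly $52, matching the figure in the announcement.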