SynthLabs
@synth_labs
Scaling Up Good Synthetic Reasoning We're hiring! ➡️ https://jobs.synthlabs.ai / DM 💬 @NathanThinks
Releasing Big-MATH—the first heavily curated & verifiable dataset designed specifically for large-scale RL training & LLM reasoning!
📝 250,000+ problems, 47k NEW Q's
✅ 10x larger than existing datasets like MATH
🧑‍⚖️ Verifiable—we eliminated 400k+ problems
Details below! 🧵👇
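The point of "verifiable" is that every problem ships with a closed-form answer an RL loop can check automatically, turning correctness into a binary reward. A minimal sketch of such a reward, assuming simple string normalization (a production verifier would compare expressions symbolically, e.g. with sympy):

```python
def exact_match_reward(model_answer: str, gold_answer: str) -> float:
    """Binary RL reward: 1.0 if the model's final answer matches the
    verified gold answer, else 0.0."""
    # Placeholder normalization -- a real verifier would parse both sides
    # and compare them symbolically rather than as strings.
    def normalize(s: str) -> str:
        return s.strip().rstrip(".").replace(" ", "").lower()

    return 1.0 if normalize(model_answer) == normalize(gold_answer) else 0.0
```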


Read how @synth_labs, a startup developing AI solutions tailored for logical reasoning, is advancing AI post-training with our @TractoAI: nebius.com/customer-stori… 🔹 Goal: Develop an ML system that empowers reasoning models to surpass pattern matching and implement sophisticated…
the future is about smart tokens
What if models could learn which problems _deserve_ deep thinking? No labels. Just let the model discover difficulty through its own performance during training. Instead of burning compute 🔥💸 on trivial problems, it allocates 5x more on problems that actually need it ↓
Our new method (ALP) monitors solve rates across RL rollouts and applies inverse difficulty penalties during RL training. Result? Models learn an implicit difficulty estimator—allocating 5x more tokens to hard vs easy problems, cutting overall usage by 50% 🧵👇1/10
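A minimal sketch of the mechanism described above, assuming a linear length penalty scaled by the group's empirical solve rate; the coefficient beta and the exact reward shaping are assumptions, not the paper's formulation:

```python
import numpy as np

def alp_rewards(correct: np.ndarray, lengths: np.ndarray,
                beta: float = 1e-3) -> np.ndarray:
    """Adaptive length penalty over k RL rollouts of the same problem.

    correct: (k,) binary solve outcomes; lengths: (k,) token counts.
    The empirical solve rate acts as an online difficulty estimate drawn
    from the model's own performance -- no difficulty labels needed.
    """
    solve_rate = correct.mean()             # high on easy problems, low on hard ones
    penalty = beta * solve_rate * lengths   # easy problems pay more per token
    return correct.astype(float) - penalty  # hard problems keep their token budget
```

Because the penalty scales with solve rate, short answers stay cheap on trivial problems while long reasoning chains remain nearly free on the hardest ones.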
Generative Reward Models' impact compounds daily. way stronger interest now than when we published last fall 👇 many excellent recent extensions—cool seeing where researchers take GenRM
we bootstrapped our way to generalized meta-reasoning capabilities with generative reward models
classical reward models can be worse than random on new reasoning tasks 🎲
we see improvements in robustness, generalization, interpretability and an opportunity to unify RLHF/RLAIF…
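The core mechanic, roughly: instead of a scalar value head, the judge LLM writes out its reasoning and the reward is parsed from the verdict it emits. A hedged sketch, where `judge` stands in for any text-generation callable (an assumption, not the paper's API):

```python
def generative_reward(judge, question: str, answer: str) -> float:
    """Generative reward model sketch: the judge reasons in natural
    language, then emits an explicit verdict; the reward is parsed from
    that verdict rather than produced by a learned scalar head."""
    prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {answer}\n"
        "Think step by step about whether the answer is correct, "
        "then end with 'Verdict: Yes' or 'Verdict: No'."
    )
    reasoning = judge(prompt)  # chain-of-thought judgment, inspectable by humans
    return 1.0 if "Verdict: Yes" in reasoning else 0.0
```

The generated reasoning is what buys the robustness and interpretability gains mentioned above, since the judgment itself can be audited.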
btw we have ongoing research on this front! we're open-science, pro-publication, and love collaboration. want to push this frontier forward? we're growing our SF team & always open to research partners—reach out, my DMs are open 📩
excellent work by @jaseweston & team—extending our "Generative Reward Models" work with RL (GRPO) to optimize LLM reasoning during judgment scalable (synthetic) evaluation continues to be AI's key bottleneck!
btw, random fun fact we pointed out months ago: the only MATH example @OpenAI published with o1 announcement included an unsubstantiated assumption 😬
> still hacks at a fairly high rate
> we wouldn't notice this agent was misaligned
Meanwhile, industry: aggressively distilling Meta-CoT slop directly into models 🫡
The final stop in our meetup series will be in San Francisco! 🌁 nebius.com/events/nebius-… Join us at Convene 100 Stockton near Union Square on Thursday, March 13, for a deep dive into our AI cloud. Our developers, AI R&D engineers and architects will share insights with the tech…
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models Author's Explanation: x.com/synth_labs/sta… Overview: Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers, is purposefully designed for…
Big-Math: Massive Math Dataset for RL Training
- 10x larger than GSM8k/MATH
- 3 core properties: uniquely verifiable, open-ended, closed-form
- Human-validated 90%+ precision filters
- Difficulty metrics for curriculum learning
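One way to put the difficulty metrics to work for curriculum learning, sketched under the assumption that each problem carries a per-problem solve rate; the `solve_rate` field name is hypothetical:

```python
def curriculum_order(problems: list[dict]) -> list[dict]:
    """Sort problems easiest-first for curriculum RL, using a per-problem
    solve rate as the difficulty metric. The 'solve_rate' key is a
    hypothetical name -- adapt it to the dataset's actual schema."""
    return sorted(problems, key=lambda p: p["solve_rate"], reverse=True)
```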