Philipp Fränken
@jphilippfranken
post-training and RL @GoogleDeepMind
Presenting this tomorrow at @NeurIPSConf East Exhibit Hall A-C #2111 (4:30–7:30 p.m. PST). Come along if you want to chat about synthetic preference data with @gandhikanishk
Constitutional AI showed LMs can learn to follow constitutions by labeling their own outputs. But why can't we just tell a base model the principles of desired behavior and rely on it to act appropriately? Introducing SAMI: Self-Supervised Alignment with Mutual Information!
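A minimal sketch of what "alignment via mutual information" can look like, assuming an InfoNCE-style contrastive bound over constitution–response pairs (an illustration, not SAMI's exact objective; `sami_style_loss` and the precomputed log-probability matrix are hypothetical):

```python
import torch
import torch.nn.functional as F

def sami_style_loss(logprobs: torch.Tensor) -> torch.Tensor:
    # logprobs[i, j] = total log-probability of response j under the model
    # when conditioned on constitution i (computed elsewhere; hypothetical).
    # Matched pairs sit on the diagonal; off-diagonal entries are negatives.
    targets = torch.arange(logprobs.size(0), device=logprobs.device)
    # Symmetrized cross-entropy: each response should be most likely under
    # its own constitution, and each constitution should best explain its
    # own response -- an InfoNCE-style lower bound on mutual information.
    return 0.5 * (F.cross_entropy(logprobs, targets)
                  + F.cross_entropy(logprobs.T, targets))
```

Lowering this loss raises a lower bound on the mutual information between the stated principles and the model's own responses, with no preference labels involved.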
It turns out that a lot of the most interesting behavior of LLMs can be explained without knowing anything about architecture or learning algorithms. Here we predict the rise (and fall) of in-context learning using hierarchical Bayesian methods.
🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient? Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵 1/
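As a toy illustration of the idea (my simplification, not the paper's hierarchical model), treat the network as Bayesian model averaging over candidate strategies whose posterior trades data fit against complexity; the generalizing strategy then dominates only in a middle regime, so in-context learning rises and later falls:

```python
import numpy as np

steps = np.arange(1, 20_001)
candidates = {  # (per-step log-likelihood slope, complexity cost); all numbers illustrative
    "uniform":    (-1.5, 0.0),      # ignores context: poor fit, zero complexity
    "generalize": (-1.0, 10.0),     # in-context rule: good fit, cheap to describe
    "memorize":   (-0.8, 2_000.0),  # lookup table: best fit, very expensive
}
log_post = np.stack([slope * steps - cost for slope, cost in candidates.values()])
post = np.exp(log_post - log_post.max(axis=0))
post /= post.sum(axis=0)
p_generalize = post[1]
# p_generalize is ~0 early, ~1 in a middle regime (ICL emerges), then ~0
# late once the memorizer's fit outweighs its complexity cost (ICL is
# transient) -- all without looking at any weights.
```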
🚨 New benchmark alert! 🚨 Can today’s LLMs implement tomorrow’s research ideas? We put them to the test. Introducing #ResearchCodeBench: 212 tasks from 2024–25 ML papers and code, most released after any model’s training cutoff. 🔗 researchcodebench.github.io 🧵
Tokasaurus is out! Happy Throughput Thursday to those who celebrate :)
Happy Throughput Thursday! We’re excited to release Tokasaurus: an LLM inference engine designed from the ground up for high-throughput workloads with large and small models. (Joint work with @achakravarthy01, @ryansehrlich, @EyubogluSabri, @brad19brown, @jshetaye,…
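A toy sketch of the continuous-batching idea behind throughput-oriented engines like this (not Tokasaurus's actual scheduler; `Request` and the stub decoder are made up for illustration): finished sequences leave the batch immediately and queued requests take their slots, so every decode step runs at full batch size:

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    max_new_tokens: int
    tokens: list = field(default_factory=list)

    @property
    def finished(self) -> bool:
        return len(self.tokens) >= self.max_new_tokens

def serve(requests, decode_step, max_batch=8):
    queue, active, done = deque(requests), [], []
    while queue or active:
        # Admit queued requests the moment slots open, instead of
        # waiting for the whole batch to finish.
        while queue and len(active) < max_batch:
            active.append(queue.popleft())
        decode_step(active)  # one decode step appends a token to each request
        done += [r for r in active if r.finished]
        active = [r for r in active if not r.finished]
    return done

# Toy usage with a stub decoder:
done = serve([Request("hi", 3), Request("yo", 5)],
             lambda batch: [r.tokens.append("tok") for r in batch])
```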
New Paper!! We try to understand why some LMs self-improve their reasoning while others hit a wall. The key? Cognitive behaviors! Read our paper on how the right cognitive behaviors can make all the difference in a model's ability to improve with RL! 🧵1/13
This is the dataset we curated for our own reasoning experiments. There is a lot of reasoning data coming out now, but we spent extra time on this one to make sure all the problems are high-quality and suitable for RL training!
thrilled to see Big-MATH climbing to #3️⃣ on @huggingface—clear signal the community wants more high-quality, verifiable RL datasets. grateful to everyone who’s been liking, downloading, and supporting ❤️
Releasing Big-MATH—the first heavily curated & verifiable dataset designed specifically for large-scale RL training & LLM reasoning! 📝 250,000+ problems, 47k NEW Q's ✅ 10x larger than existing datasets like MATH 🧑‍⚖️ Verifiable—we eliminated 400k+ problems. Details below! 🧵👇
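To make "verifiable" concrete: a minimal sketch of the kind of binary reward such a dataset enables for RL, assuming the common \boxed{} answer convention (Big-MATH's real checker is presumably more careful, e.g. about equivalent answer forms):

```python
import re

def extract_boxed(text: str):
    """Pull the final answer out of a \\boxed{...} span, the usual
    convention on MATH-style data."""
    m = re.search(r"\\boxed\{([^{}]*)\}", text)
    return m.group(1).strip() if m else None

def reward(model_output: str, gold_answer: str) -> float:
    # Binary, rule-based reward: no learned judge needed, which is what
    # makes large-scale RL on this data practical.
    pred = extract_boxed(model_output)
    return 1.0 if pred is not None and pred == gold_answer.strip() else 0.0

assert reward(r"... so the total is \boxed{42}", "42") == 1.0
assert reward("no final answer given", "42") == 0.0
```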
A note on hyperbole, halo, and language models. No, not about startup valuations!
arxiv.org/abs/2502.06204 work with the amazing @tsvilodub @gandhikanishk @HaoranZhaoHRZ @jphilipp95 @meanwhileina
Today, I launched Manas AI, a full-stack AI company setting out to shift drug discovery from a decade-long process to one that takes a few years, bringing life-saving treatments to patients faster than ever.
We’re releasing Humanity’s Last Exam, a dataset with 3,000 questions developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning. State-of-the-art AIs get <10% accuracy and are highly overconfident. @ai_risk @scaleai
Scaling inference-time interaction
As we enter the world of test-time compute, we are seeing increasing returns by simply letting our agents do their thing for longer. For the first time, we are running our agent for hundreds of steps on these benchmarks. Instead of accumulating errors, CUA introspects, updates,…
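A minimal sketch of the long-horizon loop this implies, with assumed interfaces (`env`, `agent.act`, `agent.reflect` are illustrative, not the real CUA agent): act for many steps and periodically introspect to revise the plan rather than letting early errors compound:

```python
def run_agent(env, agent, max_steps=500, reflect_every=25):
    # agent is assumed to carry an initial `plan` attribute.
    obs = env.reset()
    history = []
    for step in range(max_steps):
        if step and step % reflect_every == 0:
            # Introspect: re-read recent history and revise the plan,
            # instead of accumulating errors across hundreds of steps.
            agent.plan = agent.reflect(history)
        action = agent.act(obs, agent.plan)
        obs, done = env.step(action)
        history.append((action, obs))
        if done:
            break
    return history
```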
Ever watched someone solve a hard math problem? Their first attempt is rarely perfect. They sketch ideas, cross things out, and try new angles. This process of exploration is key to human reasoning and our latest research formalizes this as Meta Chain-of-Thought (1/8) 🧵👇
We have a new position paper on "inference time compute" and what we have been working on over the last few months! We present some theory on why it is necessary, how it works, and what it means for "super" intelligence.
SynthLabs + Stanford presents: Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought. Proposes Meta-CoT, which extends CoT by explicitly modeling the underlying reasoning required to arrive at a particular CoT
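One way to read Meta-CoT as code (my framing, not the paper's algorithm): the visible chain of thought is the trace of an underlying search that scores partial solutions and backtracks. A toy best-first version, with `propose_steps`, `is_solution`, and `score` as hypothetical callables:

```python
def meta_cot_search(propose_steps, is_solution, score, budget=100):
    frontier = [[]]                        # partial reasoning traces
    for _ in range(budget):
        if not frontier:
            break
        trace = max(frontier, key=score)   # expand the most promising trace
        frontier.remove(trace)             # abandoning it later = backtracking
        for step in propose_steps(trace):  # e.g. candidate steps sampled from an LM
            child = trace + [step]
            if is_solution(child):
                return child               # the linear CoT is this search's trace
            frontier.append(child)
    return None
```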
Presenting this cool paper led by @jphilippfranken. Come by today at 4:30 if you are around :)
How are AI Assistants being used in the real world? Our new research shows how to answer this question in a privacy-preserving way, automatically identifying trends in Claude usage across the world. 1/
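A hedged sketch of the pipeline shape such privacy-preserving analysis suggests (not Anthropic's actual code; `summarize`, `assign_topic`, and the threshold are assumptions): reduce each conversation to an identifier-free summary, cluster, and report only aggregates above a minimum size:

```python
from collections import Counter

MIN_CLUSTER_SIZE = 50  # assumed reporting threshold

def usage_trends(conversations, summarize, assign_topic):
    # summarize: conversation -> short, identifier-free summary (e.g. by an LM)
    # assign_topic: summary -> cluster label; both callables are hypothetical
    counts = Counter(assign_topic(summarize(c)) for c in conversations)
    # Report only aggregate clusters above a minimum size, so no single
    # conversation (or user) can be singled out.
    return {topic: n for topic, n in counts.items() if n >= MIN_CLUSTER_SIZE}
```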
If you're at NeurIPS, come tomorrow for the Oral+Poster on "Learning Formal Mathematics from Intrinsic Motivation"! Really fun work with @DavidKarlBroman @nickhaber @noahdgoodman that puts together much of what I did in the past years, with a new twist of open-ended learning!
Excited that @GabrielPoesia will be presenting his Oral on Learning Formal Mathematics From Intrinsic Motivation. We make and prove conjectures from scratch, without any human data, by learning what is hard but provable. Gabe’s on the job market, btw. neurips.cc/virtual/2024/o…
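A toy sketch of that intrinsic-motivation loop as I read it (hypothetical interfaces, not the paper's implementation): propose conjectures, attempt proofs, and keep the ones that are provable but took effort, so both models chase the frontier of "hard but provable":

```python
def curriculum_round(conjecturer, prover, n=1000, max_attempts=64):
    """One round of self-improvement with no human data: sample conjectures,
    search for proofs, and train on the hard-but-provable ones."""
    keep = []
    for _ in range(n):
        conjecture = conjecturer.sample()
        proof, attempts = prover.search(conjecture, max_attempts)
        # Reward the frontier: provable, but not on the first try.
        if proof is not None and attempts > 1:
            keep.append((conjecture, proof))
    conjecturer.update(keep)  # shift proposals toward the frontier of ability
    prover.update(keep)       # learn from the successful proofs
    return keep
```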