Sedrick Keh
@sedrickkeh2
research engineer @ToyotaResearch interested in pre-training, post-training, and multimodality
📢📢📢 Releasing OpenThinker3-1.5B, the top-performing SFT-only model at the 1B scale! 🚀 OpenThinker3-1.5B is a smaller version of our previous 7B model, trained on the same OpenThoughts3-1.2M dataset.
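A minimal quick-start sketch for trying the released model, assuming the checkpoint lives under the Open Thoughts org on Hugging Face (the repo id below is an assumption inferred from the naming of earlier OpenThinker releases; check the model card for the exact id and recommended generation settings):

```python
# Hedged sketch: load OpenThinker3-1.5B with Hugging Face transformers.
# "open-thoughts/OpenThinker3-1.5B" is an assumed repo id, not confirmed by the post.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker3-1.5B"  # assumption: check the Open Thoughts HF org
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Reasoning SFT models in this family are chat-tuned, so apply the chat template.
messages = [{"role": "user", "content": "What is the sum of the first 100 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```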

Just did a version of this! And it’s such an energizing experience. Travel award for 8 builders and engineers from the Philippines to spend a week in a retreat here in Silicon Valley. All are working on tough problems to deploy AI in a frontier market with poor infra but…
If you make >$300k/yr why aren’t you announcing random $1,000 prizes every Saturday for whatever you want to see happen in the world? $1k prize for best blog post on X, $1k for best art like Y, $1k for best _____. High agency mindset
a lot of data research to be done across all sorts of tasks and modalities!
Some of the most impactful work you can do in academia isn’t cool new algos or novel architectures. It’s data research. Data research isn’t just dumping tokens into a json. It requires a ton of rigorous experimentation, algorithmic thinking, and actually talking to your models.
🚨 The era of infinite internet data is ending, so we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
Open Thoughts delivers again. Congrats team for a small but powerful reasoning model. Writeup: open-thoughts.ai/blog/ot3_small
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
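To make the idea of "dynamic chunking inside the model" concrete, here is a toy illustration of a learned boundary predictor that groups raw bytes into variable-length chunks. This is explicitly not the H-Net architecture; the module name, sizes, thresholding rule, and mean-pooling are illustrative assumptions only, sketching the general concept of replacing a fixed tokenizer with an in-model chunking decision:

```python
# Toy sketch of learned dynamic chunking over raw bytes (NOT the actual H-Net).
import torch
import torch.nn as nn

class ToyDynamicChunker(nn.Module):
    def __init__(self, d_model: int = 256, vocab: int = 256):
        super().__init__()
        self.byte_embed = nn.Embedding(vocab, d_model)   # embed raw bytes, no tokenizer
        self.boundary_scorer = nn.Linear(d_model, 1)     # learned "end of chunk" score

    def forward(self, byte_ids: torch.Tensor, threshold: float = 0.5):
        x = self.byte_embed(byte_ids)                                     # (T, d_model)
        p_boundary = torch.sigmoid(self.boundary_scorer(x)).squeeze(-1)   # (T,)
        chunks, start = [], 0
        for t, p in enumerate(p_boundary):
            if p > threshold or t == byte_ids.shape[0] - 1:   # close chunk at predicted boundary
                chunks.append(x[start : t + 1].mean(dim=0))   # pool bytes into one chunk vector
                start = t + 1
        return torch.stack(chunks), p_boundary                # variable-length chunk sequence

byte_ids = torch.tensor(list("tokenization".encode("utf-8")))
chunk_vecs, probs = ToyDynamicChunker()(byte_ids)
print(chunk_vecs.shape)  # (num_chunks, d_model); num_chunks is decided by the model itself
```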
🚀Thrilled to share what we’ve been building at TRI over the past several months: our first Large Behavior Models (LBMs) are here! I’m proud to have been a core contributor to the multi-task policy learning and post-training efforts. At TRI, we’ve been researching how LBMs can…
TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the…
If you’re working on robotics and AI, the recent Stanford talk from @RussTedrake on scaling multitask robot manipulation is a mandatory watch, full stop. No marketing, no hype. Just solid hypothesis-driven science, evidence-backed claims. A gold mine in today’s landscape!
This plot is a thing of beauty. Great visualization by @MercatJean! One of many cool artifacts that arose from conducting 1000+ experiments for OpenThoughts 😀
We evaluated more than 1000 reasoning LLMs on 12 reasoning-focused benchmarks and made fascinating observations about cross-benchmark comparisons. You can explore all that data yourself on our HuggingFace spaces page. (1/4)
Web data, the “fossil fuel of AI”, is being exhausted. What’s next?🤔 We propose Recycling the Web to break the data wall of pretraining via grounded synthetic data. It is more effective than standard data filtering methods, even with multi-epoch repeats! arxiv.org/abs/2506.04689
How can we achieve both common sense understanding that can deal with varying levels of ambiguity in language and dextrous manipulation? Check out CodeDiffuser, a really neat work that bridges Code Gen with a 3D Diffusion Policy! This was a fun project with cool experiments! 🤖
🤖 Do VLA models really listen to language instructions? Maybe not 👀 🚀 Introducing our RSS paper: CodeDiffuser -- using VLM-generated code to bridge the gap between **high-level language** and **low-level visuomotor policy** 🎮 Try the live demo: robopil.github.io/code-diffuser/ (1/9)
OpenThoughts3 is the #1 trending dataset on Huggingface! Thank you to everyone who is using the dataset and giving us great feedback 🚀!
#CVPR2025 starts in two days, and I can’t wait to share our new work! 🎉 We present ZeroGrasp, a unified framework for 3D reconstruction and grasp prediction that generalizes to unseen objects. Paper📄: arxiv.org/abs/2504.10857 Webpage🌐: sh8.io/#/zerograsp (1/4 🧵)
Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.
Very proud of this work and the team! Nvidia recently released Nemotron, which is a great open reasoning model. The OpenThinker team worked tirelessly and heroically and curated what's arguably the best reasoning data, and got the model to be better than Nemotron (and GPT-4.1).…
Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average over code, science, and math evals. We also release our dataset, OpenThoughts3-1.2M, which is the best open reasoning dataset across all data…
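A small sketch for inspecting the accompanying dataset with the `datasets` library. The repo id below is an assumption based on the dataset's name (verify against the dataset card for the actual id and schema); streaming avoids downloading all 1.2M examples just to look at one:

```python
# Hedged sketch: peek at one example of the OpenThoughts3-1.2M reasoning dataset.
# "open-thoughts/OpenThoughts3-1.2M" is an assumed repo id, not confirmed by the post.
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts3-1.2M", split="train", streaming=True)
example = next(iter(ds))
print(example.keys())     # inspect the actual fields (prompt / reasoning-trace / answer style columns)
print(len(str(example)))  # rough sense of how long a single reasoning trace is
```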