Rohan Pandey
@khoomeik
RLpilled simulationmaxxer || prev research @OpenAI @CarnegieMellon '23
I’ve left OpenAI! Already miss everyone on the Training team & my friends ❤️ but very excited to soon announce what’s next. Until then, I’ll be taking a break to solve OCR for Sanskrit so we can immortalize the classical Indian literary canon in the weights of superintelligence

much more empirical/interpy work needed to understand why RL with CoT is so much better than without (not looking for theoretical explanations like test-time scaling expressivity or latent variable expectation maximization)
pictured: non-reasoning model doing non-reasoning things
guys cmon pika is chill they're smart people they did not deserve to get ratioed

fr tho sorry friends @ pika, didn't mean to send hate your way, but please do consider building civilization-enriching experiences and not "the child-eating short-form video blackhole" :)

i hate when people familiar with the discourse but who've never run an experiment in their life posture about "reading arxiv papers" or "doing research". i think i see a bit of my college freshman self in them

Prime Intellect’s “staff tweeting good stuff” game rivals the peak of OpenAI’s (which coincides with when I was there ofc) + may currently be SOTA
lots of alpha in retweeting old ilya/gdb bangers and baiting the cracked oldheads to come out of the woodwork and drop lore like this
Why I decided to do RL in 2016 after trying out MuProp on training a Neural Programmer and backprop failed me
For differentiable problems, there’s backpropagation. For everything else, there’s RL.
ML is like alchemy, turning silicon into spirits and ghosts
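A minimal sketch of the backprop-vs-RL split in the aphorism above, under toy assumptions (the quadratic loss, the |a − 3| black-box reward, the Gaussian policy, and all learning rates are illustrative, not from any tweet): when the objective is differentiable we follow its gradient directly; when the reward can only be queried, a REINFORCE-style score-function update estimates the gradient from samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Differentiable problem: minimize (theta - 3)^2 with its exact gradient
# (hand-written here; autodiff/backprop would produce the same thing).
theta = 0.0
for _ in range(100):
    grad = 2.0 * (theta - 3.0)        # dL/dtheta
    theta -= 0.1 * grad
print(f"backprop: theta = {theta:.3f}")   # ~3.0

# Non-differentiable problem: reward is a black box we can only query.
def reward(a: float) -> float:
    return -abs(a - 3.0)  # kinked at 3; pretend we cannot differentiate it

# REINFORCE: sample from a Gaussian policy N(mu, 1) and nudge mu along
# reward * grad-log-prob, which for this policy is reward * (a - mu).
mu = 0.0
for _ in range(2000):
    a = rng.normal(mu, 1.0)
    mu += 0.05 * reward(a) * (a - mu)
print(f"RL: mu = {mu:.3f}")               # noisy, but lands near 3.0
```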
RL can train an NN policy to do pretty much anything in sim. With Automatic Domain Randomization, we can bridge the sim2real gap by training the policy to be so adaptable that it can generalize to the physical robot. The result: it solves the Rubik's cube with a robot hand!
We've trained an AI system to solve the Rubik's Cube with a human-like robot hand. This is an unprecedented level of dexterity for a robot, and is hard even for humans to do. The system trains in an imperfect simulation and quickly adapts to reality: openai.com/blog/solving-r…
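A toy sketch of the ADR idea behind the quoted result, not OpenAI's actual implementation (the parameter names, ranges, thresholds, and the train_episode stub are all assumptions): each episode samples physics parameters from per-parameter ranges, and a range is widened only when the policy succeeds with a parameter pinned at that range's boundary, so the distribution of simulated worlds hardens automatically.

```python
import random

# Start with narrow ranges around the simulator's nominal physics.
ranges = {"friction": [0.9, 1.1], "cube_size_scale": [0.98, 1.02]}
EXPAND_STEP = 0.02       # how much to widen a boundary the policy has mastered
SUCCESS_THRESHOLD = 0.8  # success rate required before widening

def train_episode(params: dict) -> float:
    """Stub for one sim rollout + policy update; returns episode success.
    A real implementation would run the physics simulator here."""
    return random.random()

for step in range(1_000):
    # Sample each physics parameter uniformly from its current range.
    params = {k: random.uniform(lo, hi) for k, (lo, hi) in ranges.items()}

    # Pin one parameter to a boundary of its range to probe the policy there.
    probe = random.choice(list(ranges))
    lo, hi = ranges[probe]
    side = random.choice([0, 1])
    params[probe] = (lo, hi)[side]

    # If the policy still succeeds at the boundary, push that boundary out,
    # making the training distribution progressively more diverse.
    if train_episode(params) >= SUCCESS_THRESHOLD:
        ranges[probe][side] += EXPAND_STEP if side else -EXPAND_STEP
```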
huge ideas that promise near-infinite gain in theory often lead to colossal disasters in practice
The Bitter Lesson does not say to not bother with methods research. It says to not bother with methods that are handcrafted datapoints in disguise.
if you think the bitter lesson is invalidated by the “data wall”, please observe that we have compute->data transmogrifiers: they’re called simulators
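In the spirit of the quip, a minimal sketch with an entirely made-up toy simulator (the projectile dynamics and dataset size are illustrative): every simulator call converts compute into a fresh, perfectly labeled training pair, so the dataset scales with FLOPs rather than with scraped text.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_step(state: np.ndarray, dt: float = 0.1) -> np.ndarray:
    """Toy 2D projectile simulator: position/velocity under gravity."""
    pos, vel = state[:2], state[2:]
    return np.concatenate([pos + vel * dt, vel + dt * np.array([0.0, -9.8])])

# Compute -> data: mint as many labeled (state, next_state) pairs as we can afford.
inputs = rng.normal(size=(10_000, 4))                  # random initial states
labels = np.stack([simulate_step(s) for s in inputs])  # ground truth from sim
print(inputs.shape, labels.shape)  # (10000, 4) (10000, 4)
```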