Harshit Sikchi (will be at RLC 25)
@harshit_sikchi
Research at @openai; Reinforcement Learning; PhD from UT Austin. Previously FAIR Paris @AIatMeta, @CMU_Robotics @NVIDIAAI @UberATG.
Behavioral Foundation Models (BFMs) trained with RL are secretly more powerful than we think. BFMs directly output a policy believed to be near-optimal for any given reward function. Our new work shows that they can actually do much better:
Seeing A.R. Rahman in the office: cool perks of the job :D
It was a pleasure to meet @sama at his office …we discussed “Secret Mountain”, our virtual global band, and how to empower and uplift Indian minds to use AI tools to address generational challenges and lead the way forward. EPI @chatgptindia @OpenAI #arrimmersiveentertainment…
This will be a fantastic team!
I’m building a new team at @GoogleDeepMind to work on Open-Ended Discovery! We’re looking for strong Research Scientists and Research Engineers to help us push the frontier of autonomously discovering novel artifacts such as new knowledge, capabilities, or algorithms, in an…
A big milestone: 🥇 in the IMO under the same rules as human contestants, and GPT-5 ☕️
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition, the International Math Olympiad (IMO).
✈️ Heading to ICML #ICML2025. I’ll be presenting our work on unsupervised representations for RL (Proto Successor Measures). Happy to meet people and chat about anything representations, unsupervised RL, and LLMs!
Still in stealth but our team has grown to 20 and we're still hiring. If you're interested in joining the research frontier of deploying RL+LLM systems, shoot me an email to chat at ICML!
@agsidd10 and I will be at #ICML2025 to present our work on pretraining the right representations for decision making. Come discuss with us what the right objective for unsupervised RL should be!
What if I told you all solutions for RL lie on a (hyper)plane? We can use that fact to learn a compressed representation of an MDP that unlocks efficient policy inference for any reward function. On this plane, solving RL is equivalent to solving a linearly constrained optimization!
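A minimal sketch of the linear view hinted at above: in a tabular MDP, valid state-action occupancy measures form an affine set cut out by the Bellman flow constraints (one hyperplane per state), and maximizing expected reward over that set is a linear program. The MDP below is a made-up 2-state, 2-action example, not from the paper, using scipy's LP solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2-state, 2-action MDP; all numbers invented for illustration.
S, A_n, gamma = 2, 2, 0.9
P = np.zeros((S, A_n, S))          # P[s, a, s'] = transition probability
P[0, 0, 0] = 1.0                   # state 0, action 0: stay (reward 1)
P[0, 1, 1] = 1.0                   # state 0, action 1: jump to absorbing state 1
P[1, :, 1] = 1.0                   # state 1 absorbs with zero reward
r = np.array([[1.0, 0.0],
              [0.0, 0.0]])         # r[s, a]
mu0 = np.array([1.0, 0.0])         # initial state distribution

# Bellman flow constraints (one hyperplane per state):
#   sum_a d(s,a) - gamma * sum_{s',a'} P(s | s', a') d(s', a') = (1-gamma) mu0(s)
A_eq = np.zeros((S, S * A_n))
for s in range(S):
    for a in range(A_n):
        A_eq[s, s * A_n + a] += 1.0             # mass leaving (s, a)
        for sp in range(S):
            A_eq[sp, s * A_n + a] -= gamma * P[s, a, sp]  # discounted inflow
b_eq = (1.0 - gamma) * mu0

# Solving RL = maximizing <d, r> over this affine set (linprog minimizes).
res = linprog(-r.flatten(), A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (S * A_n))
d = res.x.reshape(S, A_n)                       # optimal occupancy measure
pi = d / np.maximum(d.sum(axis=1, keepdims=True), 1e-12)  # recovered policy
```

Here the optimum puts all occupancy mass on staying in state 0, and the policy falls out by normalizing d per state; any reward function just changes the LP objective, not the constraint set.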
New paper on RL with a diffusion/flow policy! The idea is really a one-liner: train a new policy that outputs noise vectors (z) as actions. Check out Andrew's thread below for more details! I'm leaving additional remarks on algorithms in this thread ↓
Diffusion policies have demonstrated impressive performance in robot control, yet are difficult to improve online when 0-shot performance isn’t enough. To address this challenge, we introduce DSRL: Diffusion Steering via Reinforcement Learning. (1/n) diffusion-steering.github.io
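The noise-as-actions idea above can be sketched on a toy problem: freeze a pretrained decoder (a stand-in for the diffusion policy's denoising chain), then train an ordinary RL policy over the latent noise z, never touching the decoder's weights. The decoder, reward, and hyperparameters below are all hypothetical placeholders, not the DSRL implementation; the RL part is plain REINFORCE with a moving-average baseline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen, pretrained diffusion policy: a fixed map from a
# latent noise vector z to an action. (In DSRL this would be the full frozen
# denoising chain; this toy decoder is purely illustrative.)
W = rng.normal(size=(2, 2))
def decode(z):
    return np.tanh(W @ z)

# Hypothetical task: reward actions close to a fixed target.
target = np.array([0.5, -0.3])
def reward(a):
    return -float(np.sum((a - target) ** 2))

# RL in noise space: treat z as the action and train a Gaussian policy over z.
mu, sigma, lr, baseline = np.zeros(2), 0.3, 0.05, 0.0
initial_R = reward(decode(mu))
for step in range(2000):
    z = mu + sigma * rng.normal(size=2)   # sample a latent "action"
    a = decode(z)                         # frozen policy turns z into an action
    R = reward(a)
    baseline = 0.9 * baseline + 0.1 * R   # moving-average baseline
    mu += lr * (R - baseline) * (z - mu) / sigma**2   # REINFORCE ascent step
final_R = reward(decode(mu))
```

The point of the construction: the optimization never differentiates through the (possibly many-step) denoising process, so any off-the-shelf RL algorithm can steer a frozen diffusion policy by searching over its input noise.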
Interested in deploying real robots in open-world, outdoor environments? Come to our presentation this Tuesday at 9:30 AM, poster #12 @USC to learn how we master outdoor navigation with internet-scale data and human-in-the-loop feedback! #RSS2025 @RoboticsSciSys
🗺️ Scalable mapless navigation demands open-world generalization. Meet CREStE: our SOTA navigation model that nails path planning in novel scenes with just 3 hours of data, navigating 2 km with just 1 human intervention. Project Page 🌐: amrl.cs.utexas.edu/creste A thread 🧵
Highly recommended!
We now know RL agents can zero-shot crush driving benchmarks. Can we put them on a car and replace the planning stack? We're hiring a postdoc at NYU to find out! Email me if interested and please help us get the word out.
I am not at RSS, but go meet Arthur at #RSS2025 and learn how we built a SOTA navigation model that achieved 2 km of unseen urban navigation with just 3 hours of data! Main presentation: 24 June, 9:30 AM; we will also be at the HitL-RL workshop.
Exciting PhD position open at FAIR in Paris. We are looking for a candidate to join our team and contribute to advancing the field of AI, especially reinforcement learning. Find more details and apply below. Feel free to reach out to me by email. metacareers.com/jobs/192266079…
1 day till the deadline for the RLBrew workshop @RL_Conference!
Reminder! The deadline is coming up in 5 days. Submit your work soon. You can also submit papers currently under review at NeurIPS.