Qiyang Li
@qiyang_li
Ph.D. Student @Berkeley_AI
Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! colinqiyangli.github.io/qc/ The recipe to achieve this is incredibly simple. 🧵 1/N
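To make the idea concrete, here is a minimal sketch (my own toy, not the paper's released code) of the environment-side view of chunking: a wrapper that treats a length-h chunk as one macro action. It assumes the classic gym 4-tuple step API, and `ChunkedEnv` / `chunk_size` are names I made up for illustration.

```python
import numpy as np

class ChunkedEnv:
    """Treat a length-h action chunk as a single macro action."""

    def __init__(self, env, chunk_size=5):
        self.env = env
        self.h = chunk_size

    def reset(self):
        return self.env.reset()

    def step(self, action_chunk):
        # action_chunk: (h, action_dim), executed open-loop. Sampling
        # the whole chunk at once yields temporally coherent exploration
        # instead of independent per-step noise.
        total_reward, done, obs, info = 0.0, False, None, {}
        for a in np.asarray(action_chunk):
            obs, reward, done, info = self.env.step(a)
            total_reward += reward  # a full version would discount within the chunk
            if done:
                break
        return obs, total_reward, done, info
```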
I wrote a fun little article about all the ways to dodge the need for real-world robot data. I think it has a cute title. sergeylevine.substack.com/p/sporks-of-agi
Join us on July 19th at @icmlconf, Vancouver, for the EXAIT Workshop, a full-day workshop on the role of exploration in AI today.
Check out our poster on “Reinforcement Learning with Action Chunking” tomorrow 11:45-14:15 @ EXAIT Workshop (Meeting Room 205-207)!
Had so much fun working on this😊 PyTorch and JAX implementations are both out!
For everyone interested in precise camera control in transformers (e.g., video / world models): stop settling for Plücker raymaps -- use camera-aware relative PE in your attention layers, like RoPE (for LLMs) but for cameras! Paper & code: liruilong.cn/prope/
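For intuition, here is a toy numpy sketch of the RoPE-for-cameras idea: make attention depend on relative rather than absolute camera poses. This is not PRoPE's actual formulation (see the paper for that); the flattened-pose bias and the projection `W` are simplifications I made up.

```python
import numpy as np

def relative_pose(T_i, T_j):
    # SE(3) transform of camera j expressed in camera i's frame;
    # invariant to any global rigid motion applied to both cameras.
    return np.linalg.inv(T_i) @ T_j

def camera_relative_bias(poses, W):
    # poses: (n, 4, 4) camera-to-world matrices, one per token/view.
    # W: (16,) projection of the flattened relative pose to a scalar.
    # Returns an (n, n) additive bias for pre-softmax attention logits.
    n = poses.shape[0]
    bias = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            bias[i, j] = relative_pose(poses[i], poses[j]).reshape(-1) @ W
    return bias

poses = np.stack([np.eye(4)] * 3)  # three identical cameras
bias = camera_relative_bias(poses, np.random.randn(16) * 0.01)
```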
How can we train a foundation model to internalize what it means to “explore”? Come check out our work on “behavioral exploration” at ICML25 to find out!
Fine-tuning pre-trained robotic models with online RL requires a way to train RL with expressive policies. Can we design an effective method for this? We propose EXPO, a sample-efficient online RL algorithm that enables stable fine-tuning of expressive policy classes. (1/6)
Check out our poster 11:00-1:30 today @ West Exhibition Hall #W-713!
Excited to be in Vancouver attending ICML this week to present some papers! Jul 16 (Wed) 11:30-1, W-713: 1) Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration, w/ @wilcoxsonmax (co-lead), @kvfrans, @svlevine x.com/qiyang_li/stat…
Check out our poster on Wednesday 4:30p-7p (West Exhibition Hall, #713)!!
Flow Q-learning (FQL) is a simple method to train/fine-tune an expressive flow policy with RL. Come visit our poster at 4:30p-7p this Wed (evening session, 2nd day)!
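For readers who want the gist in code, here is a compressed PyTorch sketch of the recipe as I understand it from the paper: fit a flow-matching policy to data, then distill it into a one-step policy that is also pushed uphill on Q. Network sizes, the single Euler step, and `alpha` are placeholder choices, not the reference implementation.

```python
import torch
import torch.nn as nn

obs_dim, act_dim = 17, 6
# velocity field v(s, x_t, t) for the expressive flow policy
flow = nn.Sequential(nn.Linear(obs_dim + act_dim + 1, 256), nn.ReLU(),
                     nn.Linear(256, act_dim))
# one-step policy: maps (s, noise) directly to an action
one_step = nn.Sequential(nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
                         nn.Linear(256, act_dim))
# critic Q(s, a)
q_fn = nn.Sequential(nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
                     nn.Linear(256, 1))

def flow_matching_loss(s, a):
    # Conditional flow matching: regress the velocity that carries
    # noise x0 toward the dataset action a along a straight line.
    x0 = torch.randn_like(a)
    t = torch.rand(a.shape[0], 1)
    x_t = (1 - t) * x0 + t * a
    v = flow(torch.cat([s, x_t, t], -1))
    return ((v - (a - x0)) ** 2).mean()

def actor_loss(s, alpha=1.0):
    # Distill the flow policy into the one-step policy and add a
    # Q-maximization term, so RL updates need one network call
    # instead of an ODE solve.
    z = torch.randn(s.shape[0], act_dim)
    with torch.no_grad():
        t0 = torch.zeros(s.shape[0], 1)
        a_flow = z + flow(torch.cat([s, z, t0], -1))  # one Euler step for brevity
    a = one_step(torch.cat([s, z], -1))
    distill = ((a - a_flow) ** 2).mean()
    return distill - alpha * q_fn(torch.cat([s, a], -1)).mean()

s, a = torch.randn(32, obs_dim), torch.randn(32, act_dim)
loss = flow_matching_loss(s, a) + actor_loss(s)
```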
I'm at ICML '25! Come check out our benchmark LMRL-Gym for multi-turn RL for LLMs at Wednesday's Poster Session. In addition to dialogue & text game tasks, we share a methodology for synthetic data generation to develop RL algorithms. Paper & code here: lmrl-gym.github.io
@qiyang_li will help present OTTER tomorrow at #ICML2025! A lightweight, instruction-following VLA! See OG post below! 👉Code already released at ottervla.github.io Poster will be presented at West Exhibition Hall B2-B3 #W-409 Tue 15 Jul 11 a.m. PDT — 1:30 p.m. PDT
1/N Most Vision-Language-Action models need tons of data for finetuning, and still fail for new objects and instructions. Introducing OTTER, a lightweight, easy-to-train model that uses text-aware visual features to nail unseen tasks out of the box! Here's how it works 👇
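Here is a toy sketch of what "text-aware visual features" could look like (an illustration of the general idea, not OTTER's actual architecture): let the instruction embedding attend over frozen vision-encoder patch features, so the policy sees only instruction-relevant visual content.

```python
import numpy as np

def text_aware_features(patch_feats, text_emb):
    # patch_feats: (num_patches, d) from a frozen vision encoder.
    # text_emb: (d,) instruction embedding from a frozen text encoder.
    scores = patch_feats @ text_emb / np.sqrt(patch_feats.shape[1])
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()                   # softmax over patches
    return attn @ patch_feats            # (d,) instruction-weighted summary
```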
Action chunking + expressive action distribution → better exploration for RL! This was one of the biggest lessons we learned in DPPO as well
Action chunking works really well in imitation learning, and is essential to learning good BC policies in robotics. Can/should we apply the same idea in RL? We find that RL in the action chunk space, when done right (we call it ✨Q-chunking ✨), can be highly efficient🧵👇
Action chunking is a great idea in robotics: by getting a model to produce a short sequence of actions, it _just works better_ for some mysterious reason. Now it turns out this can help in RL too, and it's a bit clearer why: action chunks help explore and help with backups. 🧵👇
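The backup intuition is easy to see in code. Below is a minimal sketch (names are mine, not the repo's) of the TD target when the critic is defined over length-h action chunks: one backup spans h environment steps, so value information propagates h times faster along a trajectory.

```python
import numpy as np

def chunk_td_target(rewards, next_q, gamma=0.99):
    # rewards: (h,) rewards collected while executing one chunk.
    # next_q: Q(s_{t+h}, a'_{t+h:t+2h}) from the target critic.
    h = len(rewards)
    discounts = gamma ** np.arange(h)
    return float(discounts @ rewards + gamma ** h * next_q)

target = chunk_td_target(np.ones(5), next_q=10.0)  # one 5-step chunk
```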