Jyo Pari
@jyo_pari
Trying to get models to continually learn | ML PhD student @MIT
What if an LLM could update its own weights? Meet SEAL🦭: a framework where LLMs generate their own training data (self-edits) to update their weights in response to new inputs. Self-editing is learned via RL, using the updated model’s downstream performance as reward.
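A minimal sketch of the loop described above, using hypothetical helper names (finetune, downstream_eval, rl_update) rather than the paper's actual code:

```python
# Sketch of one SEAL-style step: the model writes its own training data
# ("self-edit"), a copy is fine-tuned on it, and the updated model's
# downstream score is used as the RL reward for the self-edit policy.

def seal_step(model, context, finetune, downstream_eval, rl_update):
    # 1. The model generates its own training data for the new input.
    self_edit = model.generate(f"Produce training data for: {context}")

    # 2. Apply the self-edit as a weight update on a copy of the model.
    updated_model = finetune(model, self_edit)

    # 3. Reward = downstream performance of the *updated* model.
    reward = downstream_eval(updated_model)

    # 4. Reinforce the self-edit generation policy with that reward.
    rl_update(model, self_edit, reward)
    return reward
```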

If you are interested in questioning how we should pretrain models and create new architectures for general reasoning, check out E606 @ ICML, our position paper by @seungwookh and me on potential directions for the next generation of reasoning models!

MoE routers are trained a bit strangely, but things seem to work anyway. @minyoung_huh and I got curious about combining specialized experts at test time through routing… and ended up deep in the weeds of MoE optimization. Here's a blog post! jyopari.github.io/posts/peculiar…
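For readers who haven't seen it before, here is a toy sketch of the mechanism being discussed: a learned router that mixes specialized experts at test time via top-k gating. This is purely illustrative and not the blog post's code; the post digs into the actual optimization quirks.

```python
import numpy as np

def route(x, router_W, experts, k=2):
    """Combine expert outputs for input x using top-k softmax gating."""
    logits = router_W @ x                        # one score per expert
    topk = np.argsort(logits)[-k:]               # indices of the k best experts
    gates = np.exp(logits[topk] - logits[topk].max())
    gates /= gates.sum()                         # renormalize over the chosen k
    # Weighted mixture of the selected experts' outputs.
    return sum(g * experts[i](x) for g, i in zip(gates, topk))
```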

Current adaptive tokenizers still rely on humans to set the desired fidelity a priori. But what if the model could learn that itself? The part I like a lot about this paper, beyond the high-level idea, is the way @ShivamDuggal4 trained for this ability. Kudos 🎇!
Compression is the heart of intelligence. From Occam to Kolmogorov: shorter programs = smarter representations. Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, a token budget T, and a target quality 𝜖, KARL finds the smallest t ≤ T to reconstruct it within 𝜖 🧵
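To make the objective concrete, here is a toy sketch of the "smallest sufficient budget" idea stated above. The helper names (encode, decode, error) are hypothetical, and KARL learns this behavior rather than running a search at inference; this is only to illustrate what "smallest t ≤ T within 𝜖" means.

```python
def smallest_sufficient_tokens(encode, decode, error, image, T, eps):
    """Return the smallest prefix length t <= T whose reconstruction
    error is within eps, along with that reconstruction."""
    tokens = encode(image, T)           # encode with the full budget T
    for t in range(1, T + 1):
        recon = decode(tokens[:t])      # reconstruct from the first t tokens
        if error(recon, image) <= eps:  # good enough at quality eps?
            return t, recon
    return T, decode(tokens)            # fall back to the full budget
```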
Thanks @willknight for covering SEAL! Really appreciate the thoughtful and insightful way you captured the work.
Scientists at Massachusetts Institute of Technology have devised a way for large language models to keep learning on the fly—a step toward building AI that continually improves itself. wired.com/story/this-ai-…
An underrated and potentially more practical aspect of our Self-Adapting LMs paper is its use for general pre/post-training data curation. In the paper, we focus on using the same model for both generating and learning from self-edits. In practice, I imagine a "teacher"…
There are three types of storage: activations (in-context), external memory, and model weights. If models are going to spend days on a task, then they should be really good at compiling their in-context work into an external memory or into their weights! Here we try to learn weights…
it really is incredible what kinds of things become possible when RL on LLMs works. clearly we’re just getting started
There is linear attention and there is quadratic attention, but what about something in the middle? Do we really need to attend to all tokens, or can we exploit a recency bias? Check out this very inspiring work by Han!
We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between? Introducing Log-Linear Attention with:
- Log-linear time training
- Log-time inference (in both time and memory)
- Hardware-efficient Triton kernels
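A conceptual toy of the "in between" idea, not the paper's method or its Triton kernels: instead of attending to all n past tokens (quadratic) or compressing them into a single recurrent state (linear), keep O(log n) bucket summaries at power-of-two scales, so recent tokens get fine buckets and old tokens get coarse ones.

```python
import numpy as np

def power_of_two_buckets(keys, values):
    """Split the past into O(log n) contiguous chunks following the binary
    decomposition of n (largest chunks first), and mean-pool each chunk.
    Oldest tokens land in coarse buckets, recent tokens in fine ones."""
    n = len(keys)
    buckets, start = [], 0
    for bit in reversed(range(n.bit_length())):
        size = 1 << bit
        if n & size:
            k = keys[start:start + size].mean(axis=0)
            v = values[start:start + size].mean(axis=0)
            buckets.append((k, v))
            start += size
    return buckets  # at most log2(n) + 1 entries

def decode_step(query, keys, values):
    """One decoding step that attends to O(log n) summaries instead of n tokens."""
    buckets = power_of_two_buckets(keys, values)
    scores = np.array([query @ k for k, _ in buckets])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return sum(w * v for w, (_, v) in zip(weights, buckets))
```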