Paul Liang
@pliang279
Assistant Professor MIT @medialab @MITEECS @nlp_mit || PhD from CMU @mldcmu @LTIatCMU || Foundations of multisensory AI to enhance the human experience.
This spring I am teaching a new class at MIT called **How to AI (Almost) Anything**. Its name is a play on two seminal @medialab courses: How to Make (Almost) Anything (on design & fabrication) and How to Grow (Almost) Anything (on synthetic biology). We are now in the AI age, and…

This paper is impressive! It introduces a clever way of keeping memory use constant regardless of task length. Great use of RL to train AI agents to use memory and reasoning efficiently. Here are my full notes:
I am very excited about David's @ddvd233 line of work in developing generalist multimodal clinical foundation models. CLIMB (which will be presented at ICML 2025) github.com/DDVD233/climb is a large-scale benchmark comprising 4.51 million patient samples totaling 19.01 terabytes…
Thanks @iScienceLuvr for posting about our recent work! We're excited to introduce QoQ-Med, a multimodal medical foundation model that jointly reasons across medical images, videos, time series (ECG), and clinical texts. Beyond the model itself, we developed a novel training…
Building AI reasoning models with extremely long context lengths - think days, weeks, even years of context - is the next big challenge in AI. That's why I'm extremely excited about the latest work from Ao Qu @ao_qu18465, an incoming PhD student in our group, on MEM1: RL for Memory…
🚀 Excited to share my first tweet and to introduce our latest work: MEM1: RL for Memory Consolidation in Long-Horizon Agents. Long-horizon agents (e.g., deep research, web agents) typically store all observations, actions, and intermediate thoughts in context. However, much of…
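A rough sketch of the idea as I read it from the thread (the actual MEM1 training setup is not shown here): instead of appending every observation and thought to an ever-growing context, the agent keeps a fixed-size consolidated memory that gets rewritten at each step, so context cost stays constant over arbitrarily long horizons. All names below (`llm`, `consolidate`, `run_agent`, `env`) are hypothetical placeholders, not the paper's API.

```python
# Hypothetical sketch of a constant-memory agent loop in the spirit of MEM1.
# The real method trains this consolidation behavior with RL; here the LLM call
# is just a placeholder to show the control flow.

def llm(prompt: str) -> str:
    """Placeholder for any chat/completions call; not MEM1's actual model."""
    raise NotImplementedError

def consolidate(memory: str, observation: str, thought: str) -> str:
    """Rewrite the compact memory state instead of appending raw history."""
    return llm(
        "Merge the new observation and reasoning into the existing memory.\n"
        f"Memory: {memory}\nObservation: {observation}\nThought: {thought}\n"
        "Return an updated memory of roughly the same length."
    )

def run_agent(task: str, env, max_steps: int = 100) -> str:
    memory = f"Task: {task}"          # fixed-size state, not a growing transcript
    observation = env.reset()
    for _ in range(max_steps):
        thought = llm(f"Memory: {memory}\nObservation: {observation}\nThink, then pick an action.")
        action = llm(f"Memory: {memory}\nThought: {thought}\nAction:")
        observation, done = env.step(action)
        # Context stays O(1) in the number of steps: old turns are folded
        # into `memory` rather than kept verbatim.
        memory = consolidate(memory, observation, thought)
        if done:
            break
    return memory
```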
Some updates 🚨 I finished my Ph.D. at @uwcse in June 2025! After a year at AI2 as a Research Scientist, I am joining CMU @LTIatCMU & @mldcmu (courtesy) as an Assistant Professor in Fall 2026. The journey, acknowledgments & recruiting in 🧵
We will present the work TODAY at 4:30 PM at West Hall #421 with a huge poster! Come visit us!
Excited to share our latest benchmark: CLIMB, where we built a solid data foundation for multimodal clinical models. With 4.51M patient samples, totaling 19.01 TB of data across 13 domains, it's currently the largest public clinical benchmark! Paper: arxiv.org/abs/2503.07667 Code:…
🧵1/10 LLMs can answer in many languages. But do they think in them? Even when prompted in Swahili or Thai, models often switch to English for reasoning. This breaks interpretability and trust. So we ask: Can LLMs reason in the input language?
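One simple way to quantify the switching behavior described in the thread (my own illustrative check, not necessarily the paper's protocol) is to run language identification over the model's reasoning trace and compare it to the prompt language:

```python
# Illustrative check: does the chain-of-thought stay in the prompt's language?
# Uses the `langdetect` package (pip install langdetect); the paper's actual
# evaluation may differ.
from langdetect import detect

def reasoning_language_match(prompt_lang: str, reasoning_trace: str) -> bool:
    """Return True if the detected language of the reasoning matches the prompt language code."""
    return detect(reasoning_trace) == prompt_lang

# Example: a Swahili ("sw") prompt whose reasoning drifted into English would fail this check.
print(reasoning_language_match("sw", "First, I need to compute the total cost..."))  # False
```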
We’re thrilled to announce that Media Lab alumni Karrie Karahalios (@kkarahal) and Pat Pataranutaporn (@patpat_mit) are joining the Media Lab faculty on September 1! media.mit.edu/posts/mit-medi…
Are we fact-checking medical claims the right way? 🩺🤔 Probably not. In our study, even experts struggled to verify Reddit health claims using end-to-end systems. We show why—and argue fact-checking should be a dialogue, with patients in the loop arxiv.org/abs/2506.20876 🧵1/
What would a World Model look like if we start from a real embodied agent acting in the real world? It has to have: 1) A real, physically grounded and complex action space—not just abstract control signals. 2) Diverse, real-life scenarios and activities. Or in short: It has to…
Check out our growing open-source contribution, MultiNet v0.2 - a comprehensive benchmark for training and evaluating multimodal vision-language-action models on agentic and embodied tasks. Think multimodal robotics and AI agent platforms - but with all data…
Incredibly excited to announce the release of MultiNet v0.2 - a major update to our comprehensive open-source benchmark suite for evaluating Multimodal Models on Action tasks. Read on for several paper announcements, details on the evaluation harness and platform, and more!…
Most problems have clear-cut instructions: solve for x, find the next number, choose the right answer. Puzzlehunts don’t. They demand creativity and lateral thinking. We introduce PuzzleWorld: a new benchmark of puzzlehunt problems challenging models to think creatively.
Led by Prof. @pliang279, the Multisensory Intelligence group at the MIT Media Lab studies the foundations of multisensory artificial intelligence to create human-AI symbiosis across scales and sensory mediums. The group’s members draw upon their multidisciplinary backgrounds to…
Despite much progress in AI, the ability for AI to 'smell' like humans remains elusive. Smell AIs 🤖👃 can be used for allergen sensing (e.g., peanuts or gluten in food), hormone detection for health, safety & environmental monitoring, quality control in manufacturing, and more.…
Lots of interest in AI reasoning, but most use cases involve structured inputs (text) with automatic and objective verifiers (e.g. coding, math). @lmathur_'s latest work takes an ambitious step towards social reasoning in AI, a task where inputs are highly multimodal (verbal and…
Future AI systems interacting with humans will need to perform social reasoning that is grounded in behavioral cues and external knowledge. We introduce Social Genome to study and advance this form of reasoning in models! New paper w/ Marian Qian, @pliang279, & @lpmorency!
QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training "we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with…
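The tweet is cut off before the training details, but going off the title alone, here is a guess at what a domain-aware variant of GRPO's advantage computation could look like: rewards are normalized relative to other rollouts from the same clinical domain, so easier domains don't dominate the policy gradient. This is my own sketch of the idea, not the paper's implementation; the data layout and grouping key are assumptions.

```python
# Hypothetical sketch of domain-aware, GRPO-style advantage computation.
# GRPO normalizes each rollout's reward against the group it was sampled with;
# here the grouping additionally keys on the clinical domain (e.g., ECG vs.
# chest X-ray). A guess at the idea in the paper's title, not its algorithm.
from collections import defaultdict
import statistics

def domain_aware_advantages(rollouts):
    """rollouts: list of dicts with 'domain' (str) and 'reward' (float)."""
    by_domain = defaultdict(list)
    for r in rollouts:
        by_domain[r["domain"]].append(r["reward"])

    advantages = []
    for r in rollouts:
        rewards = by_domain[r["domain"]]
        mean = statistics.fmean(rewards)
        std = statistics.pstdev(rewards) or 1.0  # avoid division by zero
        advantages.append((r["reward"] - mean) / std)
    return advantages

# Example: a modest reward in a hard domain can still earn a positive advantage.
rollouts = [
    {"domain": "ecg", "reward": 0.2},
    {"domain": "ecg", "reward": 0.1},
    {"domain": "cxr", "reward": 0.90},
    {"domain": "cxr", "reward": 0.95},
]
print(domain_aware_advantages(rollouts))
```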