Rowan Zellers
@rown
thinking multimodally @thinkymachines. previously Advanced Voice Mode @openai. website: http://rowanzellers.com (he/him)
Excited to introduce GPT-4o. Language, vision, and sound -- all together and all in real time. This thing has been so much fun to work on. It's been even more fun to play with -- with moments of magic where things feel totally fluid and I forget I'm video chatting with an AI.
Excited to share that I joined @thinkymachines recently! It’s been an incredible experience so far working alongside many talented folks here. We are building multimodal AI that collaborates with humans, as well as great research infra to accelerate AI and science!
Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're…
We are moving incredibly fast. Come light up GPUs with us.
Yes - 🥳 Thinky starts hiring again: thinkingmachines.paperform.co
We have been working hard for the past 6 months on what I believe is the most ambitious multimodal AI program in the world. It is fantastic to see how pieces of a system that previously seemed intractable just fall into place. Feeling so lucky to create the future with this…
It’s really fun to work with a talented yet small team. Our mission is ambitious: multimodal AI that collaborates with humans, so the best is yet to come! Join us, or fill out the application below if interested!
If you’re excited to build the future of multimodal human/AI collaboration, and jam with Andrew, me, and many other talented people across the stack, DM me! 😀
life update: I joined @thinkymachines! feeling so lucky to build with such a kind, brilliant team, esp pairing with researchers early on as a designer. looking forward to sharing more soon.
🚀New from Meta FAIR: today we’re introducing Seamless Interaction, a research project dedicated to modeling interpersonal dynamics. The project features a family of audiovisual behavioral models, developed in collaboration with Meta’s Codec Avatars lab + Core AI lab, that…
TIL, the best (and perhaps only!!) way to speak to a human at Xfinity over phone is to say you're cancelling your service. everything else is an automated system ... that said, I learned this trick from o3, so I guess it's AI-versus-AI here
the Singapore MRT (subway) is so impressive. Many lines that go everywhere, high frequency of trains, plus it’s fully automated so it has smooth cross platform transfers. It’s safe and clean (no durians allowed). Open loop payments so you can pay by credit card…

With the rise of R1, search seems out of fashion? We prove the opposite! 😎 Introducing Retro-Search 🌈: an MCTS-inspired search algorithm that RETROspectively revises R1’s reasoning traces to synthesize untaken, new reasoning paths that are better 💡, yet shorter in length ⚡️.
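A rough sketch of the idea as I read it, in Python: revisit each step of an existing trace, sample alternative continuations from that point, and keep any branch that still reaches the correct answer in fewer steps. This is a simplified greedy revision loop, not the paper's actual MCTS-inspired algorithm; generate_continuation and is_correct are hypothetical stand-ins for a model call and an answer checker.

def retro_search(trace, question, answer, generate_continuation, is_correct, n_branches=4):
    # Start from the original R1 trace (a list of reasoning steps).
    best = trace
    for t in range(len(trace)):
        prefix = trace[:t]  # keep the steps before the revision point
        for _ in range(n_branches):
            # Sample an untaken continuation from the revision point onward.
            branch = generate_continuation(question, prefix)
            candidate = prefix + branch
            # Prefer revised traces that stay correct but are shorter.
            if is_correct(candidate, answer) and len(candidate) < len(best):
                best = candidate
    return best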
huge congrats to @bowenc0221 for showing image-manipulation works well for vision perception, and for carrying the project all the way through to the finish line!
"Thinking with Images" is what we have been cooking after GPT-4o launched last year and it marks a paradigm shift in how we view/solve perception problems in this new era of RL. It is such a pleasant and an honor to work with this amazing team to get it out!
openai.com/index/thinking… About two years ago, we started building V* to bring visual search into a multimodal LLM and show that it's a key part of how these models can understand the world. I still remember talking with my friend @bowenc0221 and @_alex_kirillov_ about why this…
🔍Introducing V*: exploring guided visual search in multimodal LLMs MLLMs like GPT4V & LLaVA are amazing, but one concern that keeps me up at night: the (frozen) visual encoder typically extracts global image tokens *only once*, regardless of resolution or scene complexity (1/n)
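To make the worry concrete, here's a toy Python loop for the guided-search alternative the thread builds toward: if one global encoding pass isn't enough to answer, iteratively pick a promising region, re-encode it at higher resolution, and append the new tokens. encode, propose_region, and can_answer are invented stand-ins, not the released V* interface.

def answer_with_visual_search(image, question, encode, propose_region, can_answer, max_steps=3):
    tokens = encode(image)  # the usual one-shot global encoding
    for _ in range(max_steps):
        if can_answer(tokens, question):
            break  # the global tokens already carry the needed detail
        # Ask the model where the missing detail might be, then look closer.
        box = propose_region(tokens, question)
        crop = image.crop(box)  # PIL-style crop of the candidate region
        tokens = tokens + encode(crop)  # append high-res tokens for that region
    return tokens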
Excited to share what I've been working on over the past few months! o3 and o4-mini are our first reasoning models with full tool support, including python, search, imagegen, etc. They also come with the best VISUAL reasoning performance to date!
Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation.
If you're going to #ICLR2025... come join me at the @thinkymachines happy hour! There will be food! (Space is limited and we can't guarantee everyone a spot, so please RSVP indicating interest)
Thinking Machines is hosting a happy hour in Singapore during #ICLR2025 on Friday, April 25: lu.ma/ecgmuhmx Come eat, drink, and learn more about us!
We release a large-scale study to answer the following:
- Is late fusion inherently better than early fusion for multimodal models?
- How do native multimodal models scale compared to LLMs?
- Can sparsity (MoEs) play a detrimental role in handling heterogeneous modalities? 🧵
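For readers new to the distinction in the first question: early fusion feeds raw patch embeddings and text tokens into one transformer from layer one, while late fusion runs a separate (often pretrained) vision tower and injects its features into the language model. A toy PyTorch sketch with invented module names, just to fix terminology:

import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    # One transformer consumes interleaved text and image tokens from the start.
    def __init__(self, transformer, text_embed, patch_embed):
        super().__init__()
        self.transformer = transformer
        self.text_embed, self.patch_embed = text_embed, patch_embed

    def forward(self, text_ids, image_patches):
        x = torch.cat([self.text_embed(text_ids), self.patch_embed(image_patches)], dim=1)
        return self.transformer(x)  # every layer sees both modalities

class LateFusion(nn.Module):
    # A separate vision tower runs first; its features are projected in afterwards.
    def __init__(self, transformer, text_embed, vision_encoder, projector):
        super().__init__()
        self.transformer = transformer
        self.text_embed = text_embed
        self.vision_encoder, self.projector = vision_encoder, projector

    def forward(self, text_ids, image):
        v = self.projector(self.vision_encoder(image))  # modality-specific encoding
        x = torch.cat([self.text_embed(text_ids), v], dim=1)
        return self.transformer(x)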
We created SuperBPE🚀, a *superword* tokenizer that includes tokens spanning multiple words. When pretraining at 8B scale, SuperBPE models consistently outperform the BPE baseline on 30 downstream tasks (+8% MMLU), while also being 27% more efficient at inference time.🧵
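A toy illustration of the superword idea, under my reading of the tweet: ordinary BPE pretokenizes on whitespace, so no merge can cross a word boundary; SuperBPE lifts that restriction in a later training stage, so frequent multi-word strings become single tokens (hence the inference savings). The vocabularies below are invented for illustration.

bpe_vocab = {"by", " the", " way"}                 # word-bounded pieces only
superbpe_vocab = bpe_vocab | {"by the way"}        # plus a superword token

def tokenize(text, vocab):
    # Greedy longest-match tokenizer, enough to show the effect.
    tokens, i = [], 0
    while i < len(text):
        piece = next(p for p in sorted(vocab, key=len, reverse=True)
                     if text.startswith(p, i))
        tokens.append(piece)
        i += len(piece)
    return tokens

print(tokenize("by the way", bpe_vocab))       # ['by', ' the', ' way']  -> 3 tokens
print(tokenize("by the way", superbpe_vocab))  # ['by the way']          -> 1 token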