Roei Herzig
@roeiherzig
Researcher @IBMResearch. Postdoc @berkeley_ai. PhD @TelAvivUni. Working on Compositionality, Multimodal Foundation Models, and Structured Physical Intelligence.
What happens when vision🤝 robotics meet? Happy to share our new work on Pretraining Robotic Foundation Models!🔥 ARM4R is an Autoregressive Robotic Model that leverages low-level 4D Representations learned from human video data to yield a better robotic model. @berkeley_ai😊
Yes!🥳
It was nice engaging with the CV community on ways to stand out from the crowd. My answer was simple: work on robotics. There are so many unanswered problems and open pastures for research if you are a new researcher. Below are 6 problems I focused on in my talk.
Thanks @IlirAliu_ for highlighting our work!🙌 🌐 Project page: arm4r.github.io 🔗 Code: github.com/Dantong88/arm4r More exciting projects on the way—stay tuned!🤖
Robots usually need tons of labeled data to learn precise actions. What if they could learn control skills directly from human videos… no labels needed? Robotics pretraining just took a BIG jump forward. A new Autoregressive Robotic Model learns low-level 4D representations…
🚀 Our code for ARM4R is now released! Check it out here 👉 github.com/Dantong88/arm4r
🚀 Excited that our ARM4R paper will be presented next week at #ICML2025! If you’re into 4D and particle-based representations for robotics, don’t miss it! 🤖✨ I won’t be there in person, but make sure to stop by and chat with Yuvan! 🙌
Love the core message here! Predictions ≠ World Models. Predictions are task-specific, but world models can generalize across many tasks.
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵
Last chance for #ICCV2025 🌴🌺 Submit your best work to the MMFM @ ICCV workshop on all things multimodal: vision, language, audio, and more. 🗓️ Deadline: July 1 🔗 openreview.net/group?id=thecv…
🚨 Rough luck with your #ICCV2025 submission? We’re organizing the 4th Workshop on What’s Next in Multimodal Foundation Models at @ICCVConference in Honolulu 🌺🌴 Send us your work on vision, language, audio & more! 🗓️ Deadline: July 1, 2025 🔗 sites.google.com/view/mmfm4thwo…
Re "vision researchers move to robotics"-they’re just returning to the field’s roots. Computer vision began as "robotic vision", focused on agents perceiving & interacting within the world. The shift to "internet vision" came later with the rise of online 2D data. Org CV book👇
“Why are so many vision / learning researchers moving to robotics?” Keynote from @trevordarrell #RSS2025
Overall, I think the move from CMT to OpenReview was a great decision. Now if only we could improve the paper-reviewer matching system!
🚀 Excited to share that our latest work on Sparse Attention Vectors (SAVs) has been accepted to @ICCVConference — see you all in Hawaii! 🌸🌴 🎉 SAVs is a finetuning-free method leveraging sparse attention heads in LMMs as powerful representations for VL classification tasks.
🎯 Introducing Sparse Attention Vectors (SAVs): A breakthrough method for extracting powerful multimodal features from Large Multimodal Models (LMMs). SAVs enable SOTA performance on discriminative vision-language tasks (classification, safety alignment, etc.)! Links in replies!…
Honored to be named an 𝐎𝐮𝐭𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 𝐑𝐞𝐯𝐢𝐞𝐰𝐞𝐫 for 𝐂𝐕𝐏𝐑 𝟐𝟎𝟐𝟓 !🎉 Grateful to contribute to the community and support the high standards of the conference. Maybe it's time to start thinking about AC-ing? 🙃 #CVPR2025 @CVPR
Got a big, bold question? Let me know! Open to your questions—ambitious ones especially! 🤖💬
🚨 Our panel kicks off at 11:30 AM in Room 207 A–D (Level 2)! Don't miss an amazing discussion with: Ludwig Schmidt, Andrew Owens, Arsha Nagrani, and Ani Kembhavi 🔥
Come hear @NagraniArsha speak tomorrow at the 3rd Workshop on “What is Next in Multimodal Foundation Models?” 🗓️ 9:05 AM — Talk 🗓️ 11:30 AM — Panel 📍 Room 207 A–D (Level 2) Don’t miss it! #CVPR2025 @CVPR
After a one year conference hiatus, it’s nice to be back at @CVPR! Come say hi if you are around. I’ll be speaking at the MMFM3 workshop at 9am and the EgoVis workshop at 4pm tomorrow (Thursday).
🎉 Excited to speak at the Agents in Interaction workshop at #CVPR2025 — featuring an incredible lineup of speakers! Come hear about our latest work on 𝑺𝒕𝒓𝒖𝒄𝒕𝒖𝒓𝒆𝒅 𝑷𝒉𝒚𝒔𝒊𝒄𝒂𝒍 𝑰𝒏𝒕𝒆𝒍𝒍𝒊𝒈𝒆𝒏𝒄𝒆 🗓 Thursday, 2:30 PM 📍 Room 213 Don’t miss it!
Join us for our workshop: Agents in Interaction, from Humans to Robots, on June 12th at 9:25 am, Room 213! We have an exciting lineup of speakers from both robotics and digital humans. Please come! @CVPR More info: agents-in-interactions.github.io
@CVPR is around the corner!! Join us at the Workshop on T4V at #CVPR2025 with a great speaker lineup (@MikeShou1, @jw2yang4ai, @WenhuChen, @roeiherzig, Yuheng Li, Kristen Grauman) covering diverse topics! Website: sites.google.com/view/t4v-cvpr2… #CVPR #Transformer #Vision #T4V2025 #T4V
Submit your paper to our Multimodal Foundation Models (MMFM) Workshop at ICCV in Honolulu, Hawaii
🚀 Call for Papers! 🚀 Excited to help organize the 4th Workshop on What is Next in Multimodal Foundation Models? at ICCV in Honolulu, Hawai'i 🌺 Submit work on vision, language, audio & more! 🗓️ Deadline: July 1, 2025 🔗 sites.google.com/view/mmfm4thwo… #MMFM4 #ICCV2025 #AI #multimodal