Jae Sung Park
@jjaesungPark
🎓 PhD student at UW advised by Yejin Choi and Ali Farhadi
Having trouble dealing with the excessive number of tokens when processing a video? Check out our paper, accepted to ICCV 2025 with an average score of 5.5! We tokenize video with tokens grounded in the trajectories of all objects rather than fixed-size patches. Trained with a…
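Roughly, the idea in code (a minimal sketch, not the paper's actual pipeline): given per-frame features and per-object trajectory masks from an off-the-shelf tracker, pool features along each trajectory so every object contributes one token instead of a grid of patch tokens. The function name, the mean pooling, and the random toy inputs below are all illustrative assumptions.

```python
import torch

def trajectory_tokens(frame_feats, traj_masks):
    """Pool per-frame features along each object trajectory into one token.

    frame_feats: (T, H, W, D) dense features for T frames.
    traj_masks:  (N, T, H, W) binary masks, one trajectory per tracked object
                 (assumed to come from an off-the-shelf tracker).
    Returns (N, D): one token per object, instead of T*H*W patch tokens.
    """
    feats = frame_feats[None]                       # (1, T, H, W, D)
    masks = traj_masks[..., None].float()           # (N, T, H, W, 1)
    summed = (feats * masks).sum(dim=(1, 2, 3))     # (N, D) masked feature sum
    counts = masks.sum(dim=(1, 2, 3)).clamp(min=1)  # (N, 1) pixels per trajectory
    return summed / counts                          # mean-pool along each trajectory

# Toy usage: 8 frames of 16x16 features, 5 tracked objects.
T, H, W, D, N = 8, 16, 16, 64, 5
tokens = trajectory_tokens(torch.randn(T, H, W, D),
                           torch.rand(N, T, H, W) > 0.5)
print(tokens.shape)  # torch.Size([5, 64])
```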
Check out our Molmo project in the poster session as well @CVPR!
Molmo won the Best Paper Honorable Mention award @CVPR! This work was a long journey over 1.5 years, from failing to get strong performance with massive-scale, low-quality data to focusing on modest-scale, extremely high-quality data! Proud to see what it became. #CVPR2025
Actions in specialized domains have lots of nuances and often appear similar. Can VLMs recognize these nuances in videos? 🎥🤔 Our NeurIPS D&B paper shows Gemini and GPT-4o score only 35% and 45% on complex actions in our benchmark. arxiv.org/abs/2410.05774 🧵 (1/n)
Meet Molmo: a family of open, state-of-the-art multimodal AI models. Our best model outperforms proprietary systems, using 1000x less data. Molmo doesn't just understand multimodal data—it acts on it, enabling rich interactions in both the physical and virtual worlds. Try it…
Announcing Superposed Decoding 🦸: a decoding method to generate multiple completions in one LM inference pass! Superposed Decoding can power applications from code suggestions to email autocomplete. 📜: arxiv.org/abs/2405.18400 💻: github.com/RAIVNLab/Super… Here’s a quick overview👇
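The core trick, as a hedged toy sketch (not the paper's exact algorithm, which also reranks drafts with n-gram interpolation): feed the model a weighted superposition of the k drafts' token embeddings so a single forward pass scores all drafts at once, then extend each draft from the top-k of the shared next-token distribution. The GPT-2 model choice, uniform weights, and the greedy top-k-to-draft assignment are all illustrative assumptions.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

k = 3                                 # number of drafts decoded in superposition
weights = torch.full((k,), 1.0 / k)   # uniform mixing weights (illustrative)

ids = tok("The best way to learn programming is", return_tensors="pt").input_ids[0]
emb = model.get_input_embeddings()
drafts = [ids.clone() for _ in range(k)]

with torch.no_grad():
    for _ in range(20):
        # Superpose: positions past the prompt hold the weighted mix of the
        # k drafts' token embeddings, so one forward pass covers all drafts.
        stacked = torch.stack([emb(d) for d in drafts])    # (k, T, d)
        mixed = (weights[:, None, None] * stacked).sum(0)  # (T, d)
        logits = model(inputs_embeds=mixed[None]).logits[0, -1]
        top = logits.topk(k).indices                       # one token per draft
        drafts = [torch.cat([d, t[None]]) for d, t in zip(drafts, top)]

for d in drafts:
    print(tok.decode(d))
```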
Can machines predict what can happen BEFORE and AFTER the image? Check out "VisualCOMET: Reasoning about the Dynamic Context of a Still Image" @ ECCV20 Spotlight @eccvconf - paper: arxiv.org/abs/2004.10796 - project page: visualcomet.xyz - live QA: 8/24 Mon 8:50 (UTC+1)
