Sherry Yang
@sherryyangML
Research Scientist @GoogleDeepMind. Previously PhD @UCBerkeley, M.Eng. / B.S. @MIT.
We are organizing a workshop on Robotics World Modeling at @corl_conf 2025! We have an excellent group of speakers and panelists, and we invite you to submit your papers by the July 13 deadline. Website: robot-world-modeling.github.io
Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team @tatsu_hashimoto @marcelroed @neilbband @rckpudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:
#CVPR2025 Tutorial: From Video Generation to World Model @CVPR. Website: world-model-tutorial.github.io. Date: June 11. Hosted by @MMLabNTU x @Kling_ai. Incredible lineup of speakers: @jparkerholder @Koven_Yu @baaadas @wanfufeng @akanazawa @sherryyangML
Join us for a full-day tutorial on Scalable Generative Models in Computer Vision at @CVPR in Nashville, on Wednesday, June 11, from 9:00 AM to 5:00 PM in Room 202 B! We are honored to have @sainingxie, @deeptigp, @thoma_gu, Kaiming He, @ArashVahdat, and @sherryyangML to…
What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:
[1/6] Generative models can dream up materials, but can we actually make them? We just released our preprint: System of Agentic AI for the Discovery of Metal-Organic Frameworks. arxiv.org/abs/2504.14110 Thread 🧵
We are organizing a workshop on Building Physically Plausible World Models at @icmlconf 2025! We have a great lineup of speakers, and we invite you to submit your papers by the May 10 deadline. Website: physical-world-modeling.github.io
At #NeurIPS2024! I'll present generative-materials.github.io and talk about generative simulators, world modeling, and video agents at the D3S3 (d3s3workshop.github.io), SSL (sslneurips2024.github.io), and Open-World Agents workshops. I'm recruiting PhD students this application cycle. …
Video generation models need grounding in the physical world to accurately simulate real-world dynamics. But where can they get feedback on physics? Our work shows VLMs can serve as effective judges of physical realism, enabling RL for video generation: arxiv.org/abs/2412.02617
Text-to-video models can generate photorealistic scenes but still struggle to accurately depict dynamic object interactions. Our new preprint addresses this through RL finetuning with AI feedback from VLMs capable of video understanding (e.g., Gemini). 1/7
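A minimal sketch of the idea described above (not the paper's actual implementation): treat a video-understanding VLM as a reward model that scores sampled videos for physical plausibility, then use those scores in a REINFORCE-style policy-gradient update of the text-to-video generator. All names below (generate_video, vlm_physics_score, policy_gradient_step) are hypothetical placeholders standing in for the real model and VLM APIs.

```python
# Hedged sketch: RL finetuning of a video generator with VLM feedback.
# Every function here is a hypothetical stub, not a real library call.

import random

def generate_video(model, prompt):
    # Placeholder: sample a video (e.g., a latent rollout) from the generator.
    return {"prompt": prompt, "frames": [random.random() for _ in range(16)]}

def vlm_physics_score(video) -> float:
    # Placeholder: query a video-understanding VLM (e.g., Gemini) with the clip
    # and a rubric like "Are the object interactions physically plausible?",
    # then map its answer to a scalar reward in [0, 1].
    return random.random()

def policy_gradient_step(model, video, reward, baseline):
    # Placeholder: REINFORCE-style update -- scale the log-likelihood gradient
    # of the sampled video by (reward - baseline). Real code would backprop
    # through the generator here.
    return reward - baseline

def finetune(model, prompts, steps=100):
    baseline = 0.5  # running mean of rewards as a simple variance reducer
    for _ in range(steps):
        prompt = random.choice(prompts)
        video = generate_video(model, prompt)
        reward = vlm_physics_score(video)
        policy_gradient_step(model, video, reward, baseline)
        baseline = 0.9 * baseline + 0.1 * reward

finetune(model=None, prompts=["a ball bouncing on concrete"], steps=10)
```

The baseline subtraction is a standard variance-reduction choice for policy-gradient methods; the key ingredient from the thread is simply that the reward comes from a VLM judging physical realism rather than from human labels.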