Nikhil Parthasarathy
@nikparth1
Research Scientist @GoogleDeepMind // working on multimodal learning, video understanding, and data curation. PhD from the Simoncelli lab @NYU_CNS. BS/MS @Stanford.
We've now been given permission to share our results. We're pleased to have been part of the inaugural cohort whose model results were officially graded and certified by IMO coordinators and experts, receiving the first official gold-level performance grade for an AI system!
All this talk about world models, but how strong are their perception abilities really? Can they track through occlusions, reason over 1hr+ videos, or predict physical scenarios? Test your models in the 3rd Perception Test Challenge at #ICCV2025, with prizes up to 50k EUR! Deadline: 6 Oct 2025
Nice work from my colleagues scaling self-supervised video encoders!
Scaling 4D Representations – new preprint arxiv.org/abs/2412.15212 and models now available github.com/google-deepmin…
Thrilled to share our latest work on SciVid, to appear at #ICCV2025! 🎉 SciVid offers cross-domain evaluation of video models in scientific applications, including medical CV, animal behavior, & weather forecasting 🧪🌍📽️🪰🐭🫀🌦️ #AI4Science #FoundationModel #CV4Science [1/5]🧵
The newly generally available Gemini 2.5 Flash and Pro are even better at video understanding than the versions we shared in the blog a month ago; see the tech report for more details 😀
Hot Gemini updates, fresh off the press. 🚀 Anyone can now use 2.5 Flash and Pro to build and scale production-ready AI applications. 🙌 We’re also launching 2.5 Flash-Lite in preview: the fastest model in the 2.5 family to respond to requests, with the lowest cost too. 🧵
Active Data Curation Effectively Distills Large-Scale Multimodal Models
- compute a per-sample loss over a large batch
- only backprop (probabilistically) through samples with high loss
Intuition: these are the samples where there is “something to learn”
- if both teacher and…
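To make the recipe above concrete, here is a minimal sketch of loss-based probabilistic sample selection, assuming a CLIP-style model whose forward pass returns a [B, B] image-text similarity matrix. The names (`model`, `keep_frac`) and the sampling scheme are illustrative assumptions, not the paper's exact ACID procedure.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, texts, keep_frac=0.2):
    # Cheap scoring pass: per-sample contrastive loss over the large
    # batch, computed without building a gradient graph.
    with torch.no_grad():
        logits = model(images, texts)  # assumed [B, B] image-text similarities
        targets = torch.arange(logits.size(0), device=logits.device)
        per_sample = F.cross_entropy(logits, targets, reduction="none")

    # Probabilistically keep high-loss samples: the ones where there is
    # "something to learn".
    k = max(1, int(keep_frac * logits.size(0)))
    idx = torch.multinomial(per_sample / per_sample.sum(), k, replacement=False)

    # Backprop only through the selected subset.
    sub_logits = model(images[idx], texts[idx])
    loss = F.cross_entropy(sub_logits, torch.arange(k, device=sub_logits.device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```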
Stop by this amazing work from Vishaal and the team today at CVPR
Our ACID paper showing how you can use active data curation as an effective way to pretrain super-strong smol and efficient VL-encoders. Poster #361 in the Poster Hall from 10:30 AM - 12:30 PM on Saturday, 14th June x.com/vishaal_urao/s…
Sad I can't be there this time, but if you're interested in active learning, data curation, distillation, and more, go check out the poster Vishaal is presenting on our work today!
A paper from my postdoc is out! nature.com/articles/s4146… How does the premotor cortex reuse the neural representation of single finger movements for simultaneous, multi-finger movements? TLDR: Ensemble activity for simultaneous movements can be explained by linear summation of…
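To make the linear-summation claim concrete, here is a toy sketch with synthetic data (all values are simulated; the paper's recordings and analysis are more involved): predict the population pattern for a simultaneous movement as the sum of the single-finger patterns and measure the fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units = 100
# Synthetic population responses to two single-finger movements.
single = {"index": rng.normal(size=n_units), "middle": rng.normal(size=n_units)}
# Synthetic "observed" response to moving both fingers at once.
observed = single["index"] + single["middle"] + rng.normal(scale=0.1, size=n_units)

# Linear-summation prediction and its fit to the observed activity.
predicted = single["index"] + single["middle"]
r = np.corrcoef(predicted, observed)[0, 1]
print(f"correlation (summed vs. observed): {r:.2f}")
```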
While we're on the topic of recognizing evals, here is a legendary vision paper from CVPR 2011. It shows that it's quite easy to classify which dataset an image comes from (39% accuracy vs. 8% random). The point: every dataset having its own distinct signature should be the default assumption.
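The "name that dataset" experiment is easy to reproduce in spirit. A minimal sketch, assuming you already have image features and per-image source-dataset labels (both are stand-ins here; the original paper used 12 datasets, hence ~8% chance accuracy):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def dataset_signature_accuracy(features, source_labels):
    """features: [N, D] image features; source_labels: [N] dataset ids.
    Held-out accuracy well above chance means each dataset leaves a
    recognizable signature on its images."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, source_labels, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)
```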
I think this paper makes a wrong claim around otherwise valid experiments. A correct title would be "LLMs can classify whether a transcript is an eval or a user interaction". Which is NOT "the model knows it's being evaluated". I hope y'all see the difference? If not, see the reply.
Fully support taking pride in hard work, but why in the world would someone want to ruin the beauty of latte art like that...
Very interesting... is training with CoT really about learning the "correct reasoning"? This seems to suggest no! Curious whether this kind of experiment (specifically, ablating the correctness of the reasoning traces during training) will be seen in other domains as well.
Do Intermediate Tokens Produced by LRMs (need to) have any semantics? Our new study "Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens", led by @kayastechly, @karthikv792, @_gundawar & @PalodVardh12428, dives into this question 🧵 1/
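One way to run the correctness ablation mentioned above: corrupt the intermediate reasoning tokens while leaving questions and final answers intact, then train on both variants and compare. A minimal sketch, assuming training examples stored as question/reasoning/answer dicts; shuffle-based corruption is one illustrative choice, not necessarily the paper's.

```python
import random

def ablate_reasoning_traces(examples, seed=0):
    """Swap each example's reasoning trace with another example's, so the
    trace no longer matches the problem it is paired with, while the
    question and final answer stay untouched."""
    rng = random.Random(seed)
    traces = [ex["reasoning"] for ex in examples]
    rng.shuffle(traces)
    return [
        {"question": ex["question"], "reasoning": t, "answer": ex["answer"]}
        for ex, t in zip(examples, traces)
    ]
```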