Prithvijit
@prithvijitch
Research Scientist @Nvidia; CS PhD @ICatGT; CV / ML; Prev: @allen_ai , @MSFTResearch, @virginia_tech | Views are my own
Check out Cosmos-Reason1, a reasoning VLM from our team for - Physical Commonsense Reasoning (spatial, temporal, intuitive physics) - Embodied Reasoning (verifying task completion, action affordance and next plausible action prediction) Models, data curation and benchmarks…
#CVPR2025 WorldModelBench submission deadline extended to Apr 15! We're inviting submissions on: - Methods for developing world (and video) models - World (and video) model downstream applications - Novel metrics, benchmarks or datasets - Analysis of safety and bias…
Join us at the WorldModelBench workshop at #CVPR2025 where we'll tackle systematic evaluation of World Models! Focus: benchmarks, metrics, downstream tasks, and safety. Submit papers now: worldmodelbench.github.io
Catch our #CVPR2025 poster today! 🖼️ “A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation” 📍 Exhibit Hall D, Poster #230 🕓 4:00–6:00 PM We explore how LLMs perform as text encoders for image generation—with some interesting findings! 🔗 Webpage:…
🚀 Introducing Cosmos-Predict2! Our most powerful open video foundation model for Physical AI. Cosmos-Predict2 significantly improves upon Predict1 in visual quality, prompt alignment, and motion dynamics—outperforming popular open-source video foundation models. It’s openly…
Cosmos-Reason1 has exciting updates 💡 Now it understands physical reality — judging videos as real or fake! Check out the resources👇 Paper: arxiv.org/abs/2503.15558 Huggingface: huggingface.co/nvidia/Cosmos-… Code: github.com/nvidia-cosmos/… Project page: research.nvidia.com/labs/dir/cosmo… (1/n)
🚀Excited to share our new LLM math reasoning work! 🔥Supervised learning (as a replacement for RL) can reach SoTA performance on LLM math reasoning! 📊
Is self-improvement exclusive to RL? Can we use supervised learning to match LLMs trained with SOTA RL algorithms? In Negative-aware Fine-Tuning (NFT), we introduce a purely supervised learning method to enhance LLMs' math reasoning with no external teachers. NFT matches or…
Cosmos-Reason1 code, model, and some data are out! Let us know if you have any feedback.
We released Cosmos-Reason1 code, model, and part of the data! We also updated our paper to include a section about our RL infra: arxiv.org/abs/2503.15558 - Code: github.com/nvidia-cosmos/… - Model and Data: huggingface.co/collections/nv… - Blog: developer.nvidia.com/blog/curating-…
We launched OpenMemory MCP. Check out the announcement 👇 x.com/taranjeetio/st…
We’re excited to launch OpenMemory MCP, a private memory for MCP-compatible clients powered by @mem0ai Today, most AI assistants and dev tools operate without memory. You plan your roadmap in Claude, implement tasks in Cursor, but none of them know what the other did. Each tool…
We're excited to announce our latest advancement in building production-ready AI Agents with scalable long-term memory. Mem0 outperformed six leading baselines across diverse tasks on the LOCOMO benchmark - from single-hop and multi-hop reasoning, to temporal and open-domain…
Cameras are key to modeling our dynamic 3D visual world. Can we unlock the 𝘥𝘺𝘯𝘢𝘮𝘪𝘤 3𝘋 𝘐𝘯𝘵𝘦𝘳𝘯𝘦𝘵?! 🌎 📸 𝗗𝘆𝗻𝗣𝗼𝘀𝗲-𝟭𝟬𝟬𝗞 is our answer! @_crockwell has curated Internet-scale videos with camera pose annotations for you 🤩 Download: huggingface.co/datasets/nvidi…
Ever wish YouTube had 3D labels? 🚀Introducing🎥DynPose-100K🎥, an Internet-scale collection of diverse videos annotated with camera pose! Applications include camera-controlled video generation🤩and learned dynamic pose estimation😯 Download: huggingface.co/datasets/nvidi…
🚨🚨 Paper submission deadline extended to May 4. Submit your work (in-progress or complete!) to the EMACS workshop @CVPR2025 in Nashville! Submission link: tinyurl.com/emacs2025submit #CVPR2025 #GenerativeAI #bias
🚀 Excited about how generative AI can power experimental (not just observational) audits of ML systems that reveal actionable insights into performance and bias? Join us at the first-ever EMACS workshop @CVPR2025 in Nashville! 🌟 Speakers & submissions: sites.google.com/view/emacs2025/
Build your personal AI study coach - powered by memory that knows you - tracks your learning journey over time - remembers the topics you struggle with - prompts you to revisit concepts periodically - understands your PDFs and notes built using @OpenAI Agents SDK + Mem0
who wants to tell him
Christopher Nolan's 'The Odyssey' Is a 'Masterpiece That Homer Himself Would Likely Be Proud Of,' Universal Executive Declares variety.com/2025/film/news…