Prateek Yadav
@prateeky2806
pre-training @AIatMeta, prev: part-time @GoogleDeepMind, PhD at @unccs
I'm on the job market! Please reach out if you are looking to hire someone to work on - RLHF - Efficiency - MoE/Modular models - Synthetic Data - Test time compute - other phases of pre/post-training. If you are not hiring then I would appreciate a retweet! More details👇
Jaemin is amazing and I would highly recommend applying for a PhD with him.
🥳 Gap year update: I'll be joining @allen_ai/@UW for 1 year (Sep2025-Jul2026 -> @JHUCompSci) & looking forward to working with amazing folks there, incl. @RanjayKrishna, @HannaHajishirzi, Ali Farhadi. 🚨 I’ll also be recruiting PhD students for my group at @JHUCompSci for Fall…
@GoogleDeepMind India 🇮🇳 & Japan 🇯🇵 are looking for strong candidates in multilinguality, multicultural, & multimodality areas. RS Bangalore: job-boards.greenhouse.io/deepmind/jobs/… RS Tokyo: job-boards.greenhouse.io/deepmind/jobs/… RE Tokyo: job-boards.greenhouse.io/deepmind/jobs/…
I've officially joined Meta Superintelligence Labs (MSL) org in the Bay Area. I'll be working on critical aspects of pre-training, synthetic data and RL for the next generation of models. Humbled and eager to contribute to the quest for superintelligence. @AIatMeta
Can someone help me to get access to this?
Comet is available today to Perplexity Max subscribers. For the wider rollout: it will be an invite-only product as we prioritize our growing waitlist. perplexity.ai/comet
We are hiring a Technical Program Manager to organize and enable our research teams to be the best at what they do and to make fast-paced progress towards our mission of bringing AGI responsibly. Ideal candidates should have a demonstrable record of strong program management…
Thrilled to share that I’ll be joining the College of Computing and Data Science at Nanyang Technological University (NTU) (@NTUsg) as an Assistant Professor, starting in August 2025 🇸🇬🥳 I’ll continue my research on building trustworthy and continually adaptable multimodal AI,…
WE CAN TALK! I spent 2 hours playing with Veo 3 @googledeepmind and it blew my mind now that it can do sound! It can talk, and this is all out of the box...
Extremely excited to announce that I will be joining @UTAustin @UTCompSci in August 2025 as an Assistant Professor! 🎉 I’m looking forward to continuing to develop AI agents that interact/communicate with people, each other, and the multimodal world. I’ll be recruiting PhD…
starting soon, in hall 4 #3 see you all there and follow that thread :)
the workshop is tomorrow! and it will be livestreamed at iclr.cc/virtual/2025/w… for those not in Singapore
30+ accepted papers, 6 oral papers, 6 guest speakers: join us at @iclr_conf on the 27th, Hall 4 #3, for a full day of workshop on Modularity for Collaborative, Decentralized, and Continual Learning sites.google.com/corp/view/mcdc… @derylucio, Fengyuan Liu, and myself will be organizing…
Workshop alert 🚨 We'll host a workshop at ICLR 2025 on modularity, encompassing collaborative + decentralized + continual learning. Those topics are on the critical path to building better AIs. Interested? Submit a paper and join us in Singapore! sites.google.com/corp/view/mcdc…
🚀New paper out - We present Video-MSG (Multimodal Sketch Guidance), a novel planning-based training-free guidance method for T2V models, improving control of spatial layout and object trajectories. 🔧 Key idea: • Generate a Video Sketch — a spatio-temporal plan with…
I think people shouldn't do phds anymore. Just focus on hardcore engineering / infra in a big frontier company and branch off to research if you're interested. All the research is happening in frontier labs anyway.
Excited to share my first paper as first author: "Task-Circuit Quantization" 🎉 I led this work to explore how interpretability insights can drive smarter model compression. Big thank you to @EliasEskin, @yilin_sung, and @mohitban47 for mentorship and collaboration. More to come!
🚨Announcing TaCQ 🚨 a new mixed-precision quantization method that identifies critical weights to preserve. We integrate key ideas from circuit discovery, model editing, and input attribution to improve low-bit quant., w/ 96% 16-bit acc. at 3.1 avg bits (~6x compression)…
I feel like Google’s TPU marketing strategy is to have a legion of ex-deepminders go out and join ai startups only to be extremely disappointed by the state of large scale GPU cluster tooling
A career peak for myself! Led physical implementation of this chip from concept to silicon debug. blog.google/products/googl…
Llama 4 Intelligence Index Update: We have now replicated Meta’s claimed values for MMLU Pro and GPQA Diamond, pushing our Intelligence Index scores for both Scout and Maverick higher. Key update details: ➤ We noted in our first post 48 hours ago that we noticed discrepancies…
Yay! Maverick is #1 open model and #2 overall The future of AI is open source - Come join us! I’m hiring
BREAKING: Meta's Llama 4 Maverick just hit #2 overall - becoming the 4th org to break 1400+ on Arena!🔥 Highlights: - #1 open model, surpassing DeepSeek - Tied #1 in Hard Prompts, Coding, Math, Creative Writing - Huge leap over Llama 3 405B: 1268 → 1417 - #5 under style control…