Vimal Thilak🦉🐒
@AggieInCA
Proverbs 17:28. I’m not learned. I'm AGI.
Thanks for sharing our work @_clashluke ! @samira_abnar wrote up a thread that hopefully makes it easy for folks interested in our work here x.com/samira_abnar/s…
Wake up babe, new MoE scaling laws dropped
Is your AI keeping up with the world? Announcing the #NeurIPS2025 CCFM Workshop: Continual and Compatible Foundation Model Updates. When/Where: Dec. 6-7, San Diego. Submission deadline: Aug. 22, 2025 (opening soon!) sites.google.com/view/ccfm-neur… #FoundationModels #ContinualLearning
Friends, should I pick this up before wrapping up my India trip? 🧐
British F-35B Fighter Jet, Grounded In Kerala, On Sale On OLX? A Fact-Check ndtv.com/india-news/bri…
New Apple #ML Research Highlight: "FastVLM: Efficient Vision Encoding for Vision Language Models" machinelearning.apple.com/research/fast-…
Oh no. Hulkster gone. One of my favorite promos was one that had Kane, The Rock and The Hulkster. RIP legend.
In this report, we describe the 2025 Apple Foundation Models ("AFM"). We also introduce the new Foundation Models framework, which gives app developers direct access to the on-device AFM model. machinelearning.apple.com/research/apple…
New paper: 'Apple Intelligence Foundation Language Models Tech Report 2025' provides technical details for two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and services
If you are attending ICML today, consider checking out Samira's poster on the role of sparsity in MoEs at 11 AM PDT. Poster ID: E-2810
🚨 One question that has always intrigued me is the role of different ways to increase a model's capacity: parameters, parallelizable compute, or sequential compute. We explored this through the lens of MoEs:
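As a concrete illustration of those capacity axes (a toy sketch, not the paper's implementation), a top-k mixture-of-experts layer lets total parameters grow with the number of experts while per-token compute is set by how many experts each token activates. All dimensions and the top-2 routing below are assumptions:

```python
# Toy top-k MoE layer: total parameters scale with n_experts,
# per-token compute scales with k (the number of active experts).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)    # routing probabilities
        weights, idx = gate.topk(self.k, dim=-1)    # pick k experts per token
        weights = weights / weights.sum(-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):                  # dense loops, for clarity
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

layer = TopKMoE()
total_params = sum(p.numel() for p in layer.parameters())
# Doubling n_experts roughly doubles total_params but leaves per-token FLOPs
# unchanged; raising k raises per-token FLOPs without adding parameters.
print(total_params, layer(torch.randn(4, 64)).shape)
```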
We propose new scaling laws that predict the optimal data mixture for pretraining LLMs, native multimodal models, and large vision encoders! Only small-scale experiments are needed, and we can then extrapolate to large-scale ones. These laws allow 1/n 🧵
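The fit-then-extrapolate recipe the thread describes can be sketched generically (an illustration under assumed numbers and an assumed functional form, not the paper's actual mixture laws or data): fit a saturating power law to losses from small-scale runs, then query it at a much larger budget.

```python
# Generic scaling-law fit: loss(C) = a * C^(-b) + e, where e is the
# irreducible loss. All data points and the form itself are assumptions.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(c, a, b, e):
    return a * np.power(c, -b) + e

# Hypothetical (compute, loss) measurements from small-scale runs;
# compute is in units of 1e18 FLOPs.
compute = np.array([1.0, 3.0, 10.0, 30.0, 100.0])
loss = np.array([3.10, 2.85, 2.62, 2.47, 2.33])

params, _ = curve_fit(scaling_law, compute, loss, p0=(1.5, 0.2, 1.5))
print("fitted (a, b, e):", params)
# Extrapolate to 1e22 FLOPs (1e4 in these units), far beyond the fit range.
print("predicted loss at 1e22 FLOPs:", scaling_law(1e4, *params))
```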
Here's an Apple@ICML guide with all our talks, posters, and booth events: 🔗 machinelearning.apple.com/updates/apple-… Come say hi if you're around, always happy to chat. Looking forward to a week of great research, and catching up with familiar faces (and meeting new ones too).
Also lucky to be co-authoring two more posters during the same session with my awesome colleagues: Parameters vs FLOPs for MoEs (E-2810) with @samira_abnar, @harshays_, @alaa_nouby, Josh Susskind, and @AggieInCA 🔗 icml.cc/virtual/2025/p…
Excited to be heading to Vancouver for #ICML2025 next week! I'll be giving a deep dive on Distillation Scaling Laws at the expo — exploring when and how small models can match the performance of large ones. 📍 Sunday, July 13, 5pm, West Ballroom A 🔗 icml.cc/virtual/2025/4…
Rishabh doing stuff Vadivelu did in Kovai Brothers
It's all happening 😅 Big swing no ding from Rishabh Pant 😂
Journals typically have "comments on …" papers for the reasons described below. The authors do note this on page 7 of the paper. I wish we would get back to a saner model for publishing research.
New position paper! Machine Learning Conferences Should Establish a “Refutations and Critiques” Track Joint w/ @sanmikoyejo @JoshuaK92829 @yegordb @bremen79 @koustuvsinha @in4dmatics @JesseDodge @suchenzang @BrandoHablando @MGerstgrasser @is_h_a @ObbadElyas 1/6
Hey AI folks! 🚀We have an exciting opportunity to join the Apple Machine Learning Research team. If you’re passionate about pushing the boundaries of AI and working on cutting-edge research, we’d love to have you. Check out the role here: jobs.apple.com/en-us/details/…
Porampokku (Tamil: "good-for-nothing")
Soon Grok will automatically translate foreign-language posts for you
We had to downsize due to NIH funding cuts and lay off a junior software engineer who is proficient in Python coding, crawling, LLMs, RAG, and other related areas. He is currently on OPT (24 months) and will need an H1B sponsor. If any startups are interested, pls DM. RT for…
At WWDC we introduced a new generation of LLMs developed to enhance Apple Intelligence features. We also introduced the new Foundation Models framework, which gives app developers direct access to the on-device foundation language model. machinelearning.apple.com/research/apple…