Puneesh Deora
@puneeshdeora
PhD student at UBC. Working on the foundations of LLMs and theory of DL. Loves memes :)
🚨 New paper drop! 🚨 🤔 When a transformer sees a sequence that could be explained by many rules, which rule does it pick? It chooses the simplest sufficient one! 🧵👇

At least we know it's written by a human
Anyone know Adam?
When did we decide to call 7B-parameter models small-scale? A 2-layer, 2-head model with embedding dimension 64 is what I call a small-scale model.
Swing by HiLD at #ICML2025 today to learn more about our (ongoing) work on ✨generalization of Shampoo/Muon vs. GD✨ 🔎 ✒️ With @puneeshdeora and Christos Thrampoulidis.
Haven’t been to that many conferences, but I’ll say it anyway: Vancouver might be the GOAT venue. #ICML2025
When people at Meta see someone walk in knowing calculus, linear algebra, and probability theory
See below on what Zuckerberg is looking for in star recruits worth $100m pay packages for Meta’s plans in Artificial Intelligence. But weren’t some people saying calculus is no longer useful in the AI age? 🤔
People move to SF and the mildest take turns into a 2,000-word manifesto.
Idek what AGI is. Seriously, can someone define it for me please?
Are there still any AI experts who think we won’t achieve AGI soon?
Shannon's Master's Thesis laid the groundwork for digital circuits; what are you trying to pull here