DG.
@dataghees
scaling speech-native LLMs @rimelabs. the future is willed into existence. bioML, discovering new science, housing, local politics.
This is amazing. Ever since I heard @balajis talk about the pseudonymous economy, I've been thinking about how to make it easier to start a YouTube channel with total privacy. Recent advances in deepfakes, AR, and CV will make this as easy as a click. You won't need a $30k suit.
This is Miko. She's a virtual streamer who is controlled by a real-life woman known only as The Technician. The Technician uses the Unreal Engine and a $30,000 motion-capture suit to create Miko. Thread 👇
this is exactly right. was a big motivation for me to do a lecture series
Announcing The Toronto School Of Foundation Modelling, a Toronto-exclusive, in-person-only school for learning to build Foundation Models. Coming to New Stadium and Youthful Vengeance in late August 2025.
clean torch-native multimodal library, forked from torch-titan. Train your own models! (TYOM) https://github.com/xingchensong/TouchNet/tree/main
Slides for my lecture “LLM Reasoning” at Stanford CS 25: dennyzhou.github.io/LLM-Reasoning-… Key points: (1) Reasoning in LLMs just means generating a series of intermediate tokens before producing the final answer. It does not matter whether it resembles human reasoning or not. The key point…
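To make point (1) concrete, here is a minimal sketch (mine, not from the slides): "reasoning" is just the model emitting intermediate tokens before its final answer. The model name, prompt wording, and use of the transformers text-generation pipeline are illustrative assumptions.

```python
# Minimal sketch: "reasoning" = intermediate tokens generated before the
# final answer. Model choice and prompt are placeholders, not from the talk.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

prompt = (
    "Q: A bat and a ball cost $1.10 total. The bat costs $1.00 more than "
    "the ball. How much does the ball cost?\n"
    "Think step by step, then give the final answer on its own line.\nA:"
)

out = generator(prompt, max_new_tokens=128, do_sample=False)[0]["generated_text"]
# Everything the model prints between the question and the last line is the
# "reasoning": intermediate tokens that condition the final answer.
print(out)
```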
> train modality specific encoder > expand tokenizer > predict next token > ??? > profit
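Spelled out, the recipe above is: quantize the new modality into extra token ids with a separately trained encoder, grow the tokenizer/embedding table to hold them, and run standard next-token cross-entropy. A hypothetical PyTorch sketch, with all names, sizes, and the toy decoder being illustrative (not TouchNet or torch-titan code):

```python
# Hypothetical sketch of the meme recipe, not any specific codebase:
# 1) a separately trained modality encoder/quantizer turns e.g. audio into
#    discrete token ids, 2) the text vocab is expanded to make room for them,
# 3) everything is trained with plain next-token prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

TEXT_VOCAB = 32_000        # original text tokenizer size (illustrative)
MODALITY_VOCAB = 1_024     # new tokens appended for the extra modality

class TinyDecoder(nn.Module):
    def __init__(self, vocab_size, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)   # expanded table
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                            # (B, T) token ids
        T = tokens.size(1)
        causal = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                       device=tokens.device), diagonal=1)
        h = self.blocks(self.embed(tokens), mask=causal)
        return self.lm_head(h)                            # (B, T, vocab)

model = TinyDecoder(TEXT_VOCAB + MODALITY_VOCAB)

# Pretend batch: text ids live in [0, TEXT_VOCAB); modality ids (produced by
# the frozen encoder/quantizer) live in [TEXT_VOCAB, TEXT_VOCAB + MODALITY_VOCAB).
text_ids = torch.randint(0, TEXT_VOCAB, (2, 8))
modality_ids = torch.randint(TEXT_VOCAB, TEXT_VOCAB + MODALITY_VOCAB, (2, 8))
tokens = torch.cat([modality_ids, text_ids], dim=1)

# "predict next token": shift targets by one and apply cross-entropy.
logits = model(tokens[:, :-1])
loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                       tokens[:, 1:].reshape(-1))
loss.backward()   # the "??? -> profit" steps are just this, at scale
```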
amazing turnout.
📣 AI Lecture Series continues with BIOREASON by @adibvafa He'll talk about: * The first reasoning model made for biology * How that will change the future of pharmaceutical and biomaterial discovery * The biotech ecosystem in Toronto 🗓️ Wed July 23 🎟️ Link below
The Bitter Lesson does not say not to bother with methods research. It says not to bother with methods that are handcrafted datapoints in disguise.
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
do people have specific examples?
It’s often far easier for a Spanish company to sell products in Germany than it is for a Nova Scotia-based company to sell products in Manitoba. Absolutely insane behaviour.
📣 Excited to share our real-world study of an LLM clinical copilot, a collab between @OpenAI and @PendaHealth. Across 39,849 live patient visits, clinicians with AI had a 16% relative reduction in diagnostic errors and a 13% reduction in treatment errors vs. those without. 🧵
The Invisible Leash: Why RLVR May Not Escape Its Origin "RLVR is constrained by the base model's support (unable to sample solutions with zero initial probability) and operates as a conservative reweighting mechanism that may restrict the discovery of entirely original solutions"…
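The quoted argument can be stated in one line (my paraphrase, not the paper's notation): if RLVR only reweights the base distribution, it cannot place probability mass where the base model has none.

```latex
% Support constraint (paraphrase): RLVR acts as a reweighting of the base
% policy, so zero-probability solutions stay at zero probability.
\[
\pi_{\mathrm{RLVR}}(y \mid x) \;\propto\; \pi_{\mathrm{base}}(y \mid x)\, w(x, y),
\qquad
\pi_{\mathrm{base}}(y \mid x) = 0 \;\Longrightarrow\; \pi_{\mathrm{RLVR}}(y \mid x) = 0 .
\]
```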
#BREAKING: Toronto Blue Jays break 40-year record for most consecutive home game wins cp24.com/news/sports/20…
Congrats to the GDM team on their IMO result! I think their parallel success highlights how fast AI progress is. Their approach was a bit different than ours, but I think that shows there are many research directions for further progress. Some thoughts on our model and results 🧵
Kimi K2 paper dropped! describes: - MuonClip optimizer - large-scale agentic data synthesis pipeline that systematically generates tool-use demonstrations via simulated and real-world environments - an RL framework that combines RLVR with a self-critique rubric reward mechanism…
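As a generic illustration of the last bullet (NOT the Kimi K2 implementation, which the tweet does not detail): "combining RLVR with a rubric reward" just means blending a hard verifiable check with a softer critique score. The function name, weighting, and scoring scheme below are all hypothetical.

```python
# Generic illustration of mixing a verifiable reward with a rubric-based
# self-critique score. Not taken from the Kimi K2 paper.
def combined_reward(answer: str, reference: str, rubric_scores: list[float],
                    alpha: float = 0.5) -> float:
    verifiable = 1.0 if answer.strip() == reference.strip() else 0.0  # RLVR-style exact check
    critique = sum(rubric_scores) / len(rubric_scores)                # rubric scores in [0, 1]
    return alpha * verifiable + (1 - alpha) * critique

print(combined_reward("42", "42", rubric_scores=[0.8, 0.6, 1.0]))
```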
Advanced version of Gemini Deep Think (announced at #GoogleIO) using parallel inference time computation achieved gold-medal performance at IMO, solving 5/6 problems with rigorous proofs as verified by official IMO judges! Congrats to all involved! deepmind.google/discover/blog/…
It’s called Pakistani Pizza (Pizza Karachi on Queen W)
Chicken Tikka Masala Pizza should be a thing. I’m sure someone’s made it, but I’ve never had it and I want it.
A striking thing about OpenAI's IMO gold math model is how terse it is: it really tries to express itself in single tokens, often breaking the rules of grammar and spelling to do so. They say compression is intelligence. We may be seeing a totally novel way to do compression…
So why is this impressive? This is pretty different from normal reasoning model training, where correct answers are rewarded: it's typically much easier to guess the answer to an IMO question than to prove it. E.g., guessing correctly might earn 1/7 marks, while a full proof earns 7/7.