Xavier Gonzalez
@xavierjgonzalez
PhD candidate in AI and Machine Learning at @Stanford. Advised by @scott_linderman. Parallelizing nonlinear dynamics, including RNNs. All views my own.
So excited by our latest @NeurIPSConf paper on parallelizing nonlinear RNNs! With my amazing collaborators @awarr9, @jimmysmith1919, and @scott_linderman. We are building on the beautiful DEER algorithm by YH Lim, @mfkasim, et al. (arxiv.org/abs/2309.12252). Thread below!
Did you know that you can parallelize *nonlinear* RNNs over their sequence length!? Our @NeurIPSConf paper "Towards Scalable and Stable Parallelization of nonlinear RNNs" introduces quasi-DEER and ELK to parallelize ever larger and richer dynamical systems! 🧵 [1/11]
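For intuition, here is a minimal numpy sketch of the DEER idea (my own illustrative code, not the paper's implementation; the toy tanh recurrence and names like deer_solve are made up): guess the whole state trajectory at once, linearize the recurrence around the guess, solve the resulting *linear* recurrence, and repeat. The linear solve is written sequentially here for clarity, but it is exactly the associative-scan form that runs in O(log T) depth on parallel hardware, which is where the speedup over sequence length comes from.

import numpy as np

def f(h_prev, x, a=0.5):
    # toy nonlinear recurrence: h_t = tanh(a * h_{t-1} + x_t)
    return np.tanh(a * h_prev + x)

def df_dh(h_prev, x, a=0.5):
    # derivative of f with respect to the previous state
    return a * (1.0 - np.tanh(a * h_prev + x) ** 2)

def deer_solve(xs, h0=0.0, n_iters=50, tol=1e-8):
    T = len(xs)
    h = np.zeros(T)  # initial guess for the whole trajectory
    for _ in range(n_iters):
        h_prev = np.concatenate([[h0], h[:-1]])
        A = df_dh(h_prev, xs)            # Jacobians along the current guess
        b = f(h_prev, xs) - A * h_prev   # affine part of the linearization
        # Linearized system: h_t = A_t * h_{t-1} + b_t. Evaluated sequentially
        # here for clarity, but this is exactly the form a parallel scan
        # computes in O(log T) depth.
        new_h = np.empty(T)
        acc = h0
        for t in range(T):
            acc = A[t] * acc + b[t]
            new_h[t] = acc
        done = np.max(np.abs(new_h - h)) < tol
        h = new_h
        if done:
            break
    return h

xs = np.random.randn(100)
h_seq, acc = np.empty(100), 0.0
for t in range(100):   # ordinary sequential rollout for comparison
    acc = f(acc, xs[t])
    h_seq[t] = acc
assert np.allclose(deer_solve(xs), h_seq, atol=1e-6)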
Finally, a dev kit for designing on-device, mobile AI apps is here: Liquid AI's LEAP venturebeat.com/business/final…
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵
We are excited to release our first open-weight LFM models, optimized for on-device deployments. Extremely proud of the entire team! Check them out here: huggingface.co/LiquidAI
Today, we release the 2nd generation of our Liquid foundation models, LFM2. LFM2 sets the bar for quality, speed, and memory efficiency in on-device AI. Built for edge devices like phones, laptops, AI PCs, cars, wearables, satellites, and robots, LFM2 delivers the fastest…
I asked ChatGPT (o3-pro) what the most unbelievable things it has learned about humans since being created are. I find no. 5 and the last one (meta-surprise) quite funny 🧵 Read on 👇 1. Simultaneous brilliance and self‑sabotage: Humans can design spacecraft that navigate billions…
LLMs can generate 100 answers, but which one is right? Check out our latest work closing the generation-verification gap by aggregating weak verifiers and distilling them into a compact 400M model. If this direction is exciting to you, we’d love to connect.
How can we close the generation-verification gap when LLMs produce correct answers but fail to select them? 🧵 Introducing Weaver: a framework that combines multiple weak verifiers (reward models + LM judges) to achieve o3-mini-level accuracy with much cheaper non-reasoning…
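Toy illustration of the aggregation idea (my own sketch, not Weaver's code: the weights below are hand-picked, whereas Weaver estimates verifier reliabilities without labeled data via weak supervision): score each candidate answer with several imperfect verifiers, then pick the candidate with the highest weighted score.

import numpy as np

def pick_answer(scores, weights):
    # scores: (n_verifiers, n_candidates) array of verifier confidences in [0, 1]
    # weights: per-verifier reliabilities; returns index of the chosen candidate
    return int(np.argmax(weights @ scores))

scores = np.array([
    [0.2, 0.7, 0.6, 0.1],   # reward model
    [0.3, 0.8, 0.4, 0.2],   # LM judge 1
    [0.5, 0.6, 0.9, 0.3],   # LM judge 2
])
weights = np.array([0.5, 0.3, 0.2])  # hand-picked for illustration only
print(pick_answer(scores, weights))  # -> 1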
I'm interning this summer at Apple doing machine learning research! I'll be in Seattle, and I'd love to meet up with you if you are in town! Please reach out!
My grand theory of life is to just hammer the shit out of quadrant 4
True this. Honestly, what is there to gain from having two deadlines a week apart? No reviewer action happens during that week, and we already know we'll be facing the heaviest submission load, where most reviewers are likely to max out. So having two separate PDFs (or worse, a…
Second this. There's no point in separate deadlines; they do more harm than good.
Can anyone explain to me why this behavior of anonymous functions makes sense in python? I really don't understand the logic...
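The screenshot isn't shown here, but the usual culprit behind surprising anonymous-function behavior in Python is late binding: a lambda captures the variable itself, not its value at definition time. A guess at the gotcha being asked about:

funcs = [lambda: i for i in range(3)]
print([f() for f in funcs])  # [2, 2, 2]: every lambda reads the *final* i

# The standard fix binds the current value as a default argument:
funcs = [lambda i=i: i for i in range(3)]
print([f() for f in funcs])  # [0, 1, 2]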

An incredible feature. The blog post it made for my paper "Towards Scalable and Stable Parallelization of nonlinear RNNs" was in some ways better than the blog post I put a lot of time into making! alphaxiv.org/overview/2407.… lindermanlab.github.io/hackathons/
We used Mistral OCR with Claude 3.7 to create blog-style overviews for arXiv papers. Generate beautiful research blogs with figures, key insights, and clear explanations from the paper with just one click. Understand papers in minutes, not hours.
This is an amazing class and a great way to learn cutting-edge generative models like diffusion. It also comes with a beautiful set of course notes: diffusion.csail.mit.edu/docs/lecture-n… Tons of thanks to @peholderrieth for creating this super helpful resource!
Our MIT class “6.S184: Introduction to Flow Matching and Diffusion Models” is now available on YouTube! We teach state-of-the-art generative AI algorithms for images, videos, proteins, etc. together with the mathematical tools to understand them. diffusion.csail.mit.edu (1/4)
Recording: youtu.be/C7KnW8VFp4U Slides: asap-seminar.github.io/assets/slides/… If you're interested in this seminar series, please subscribe to our mailing list: groups.google.com/g/asap_seminar and join our Discord channel: discord.com/invite/vDaJTmK…
🚀 Announcing ASAP: asap-seminar.github.io! A fully virtual seminar bridging theory, algorithms, and systems to tackle fundamental challenges in Transformers. Co-organized by @simran_s_arora @Xinyu2ML @HanGuo97 Our first speaker: @heyyalexwang on Test-time Regression
tomorrow at 10:30 pst/1:30 est i’ll be talking at the first ASAP seminar organized by @SonglinYang4 @simran_s_arora @Xinyu2ML @HanGuo97! i’ll present recent work on a unifying framework for current sequence models like mamba, attention, etc it’s all online so come thru!
Excited to share that our latest research on parallelizing nonlinear RNNs is being featured on the arXiv discussion forum @askalphaxiv. I will be on alphaXiv to answer any questions you have on the paper. alphaxiv.org/abs/2407.19115
did you know you've been doing test-time learning this whole time? transformers, SSMs, and RNNs are all test-time regressors, just with different design choices. we present a unifying framework that derives sequence layers (and higher-order attention👀) from a *single* equation 🧵
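A minimal numpy sketch of the test-time-regression view (my own illustration of the simplest instance, not the paper's code): treat the layer's memory as regression weights fit online to (key, value) pairs, taking one gradient step per token on ||M k_t - v_t||^2 and reading out with the query.

import numpy as np

def sequence_layer(ks, vs, qs, lr=1.0):
    d = ks.shape[1]
    M = np.zeros((d, d))  # memory = regression weights, updated at test time
    outs = []
    for k, v, q in zip(ks, vs, qs):
        # one gradient step on ||M k - v||^2 (a delta-rule-style update);
        # using M += np.outer(v, k) instead recovers plain linear attention
        M += lr * np.outer(v - M @ k, k)
        outs.append(M @ q)  # read out with the query for this position
    return np.array(outs)

T, d = 8, 4
ks, vs, qs = np.random.default_rng(0).normal(size=(3, T, d))
print(sequence_layer(ks, vs, qs).shape)  # (8, 4)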