Yiping Lu
@2prime_PKU
Kernel, ML for PDE, robust learning, non-parametric stats / 🌈 / PKU 👉 Stanford 👉 NYU Courant 👉 Prof. @Northwestern IEMS / Former intern @RIKEN_AIP
everyone always asks who/what is adam. never how is adam
Interested in the foundational side of ML? Waiting on or unhappy with your NeurIPS reviews? Plz consider the NeurIPS workshop DynaFront: Dynamics at the Frontiers of Optimization, Sampling, and Games sites.google.com/view/dynafront… @yuejiec @Andrea__M @btreetaiji @T_Chavdarova ++ Sponsors appreciated!
Thrilled to introduce "𝗗𝗲𝗲𝗽 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵𝗲𝗿 𝘄𝗶𝘁𝗵 𝗧𝗲𝘀𝘁-𝗧𝗶𝗺𝗲 𝗗𝗶𝗳𝗳𝘂𝘀𝗶𝗼𝗻," a new deep research agent designed to mimic the iterative nature of human research, complete with cycles of planning, drafting, and revision. 🚀🚀 arxiv.org/pdf/2507.16075
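The planning → drafting → revision cycle is easy to picture in code. A conceptual sketch of such an agent loop (my own illustration, not the paper's actual pipeline; `llm`, `deep_research`, and `n_revisions` are hypothetical stand-ins for any text-completion API):

```python
def llm(prompt: str) -> str:
    """Hypothetical stub for any text-completion API."""
    raise NotImplementedError

def deep_research(question: str, n_revisions: int = 3) -> str:
    # Plan, then draft, then repeatedly critique-and-revise; the "test-time
    # diffusion" analogy is that each pass denoises the draft toward a report.
    plan = llm(f"Outline a research plan for: {question}")
    draft = llm(f"Write an initial report following this plan:\n{plan}")
    for _ in range(n_revisions):
        critique = llm(f"Critique this draft and list missing evidence:\n{draft}")
        draft = llm(f"Revise the draft to address the critique:\n{critique}\n\nDraft:\n{draft}")
    return draft
```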
Holy shit. Kimi K2 was pre-trained on 15.5T tokens using MuonClip with zero training spikes. Muon has officially scaled to the 1-trillion-parameter LLM level. Many doubted it could scale, but here we are. So proud of the Muon team: @kellerjordan0, @bozavlado, @YouJiacheng,…
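For anyone wondering what the Muon update actually computes: it replaces the raw momentum matrix with an approximately orthogonalized one via a Newton-Schulz iteration. A minimal sketch following the public open-source Muon reference (MuonClip's QK-clip addition is omitted; the coefficients below are from that release):

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    # Quintic Newton-Schulz iteration: drives the singular values of G
    # toward 1 without computing an SVD.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + 1e-7)  # Frobenius norm bounds the spectral norm
    if G.shape[0] > G.shape[1]:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if G.shape[0] > G.shape[1] else X

# One Muon step for a 2D weight (sketch): orthogonalize the momentum buffer
# and apply it as the update direction.
# buf = mu * buf + grad
# W -= lr * newton_schulz_orthogonalize(buf)
```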
Our paper aims to answer two questions: 1. What's the difference between prediction and world models? 2. Are there straightforward metrics that can test this distinction? Our paper is about AI. But it's helpful to go back 400 years to answer these questions.
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: a transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵
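A toy version of the probe behind this thread: ask a trained next-state predictor what force law it implicitly encodes, then compare it to Newton. Illustrative only, not the paper's actual metric; `predict_next_state` is a hypothetical trained model:

```python
import numpy as np

def implied_force(predict_next_state, pos, vel, dt=1e-2):
    # `predict_next_state` maps (pos, vel) -> (next_pos, next_vel).
    _, next_vel = predict_next_state(pos, vel)
    accel = (next_vel - vel) / dt      # acceleration the model implies
    r = np.linalg.norm(pos)
    newtonian = -pos / r**3            # inverse-square law with G*M = 1
    return accel, newtonian

# Perfect next-step prediction on-distribution does not force
# accel ≈ newtonian off-distribution: the implied "law" can be arbitrary.
```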
📢 Call for Papers: NeurReps 2025 ‼️‼️‼️ 🧠 Submit your research on symmetry, geometry, and topology in artificial and biological neural networks. Two tracks: Proceedings (9 pages) and Extended Abstract (4 pages). Deadline: Aug 22, 2025. neurreps.org/call-for-papers
This is an amazingly well-written paper; give it a read. The kind of paper that makes you jealous you're not on the author list. @marcgbellemare and the team.
And btw, the method we proposed in this work, U-MAE, is very similar in spirit to Kaiming’s recent work on Dispersive Loss, where you add a feature uniformity loss to a reconstruction loss (they add it to diffusion while we add it to MAE). We actually did this back in 2022 and…
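The shared recipe in both papers is: total loss = reconstruction + λ × a feature-spreading term. A minimal sketch using one standard uniformity term (Wang & Isola style, close in spirit but not the exact U-MAE or Dispersive Loss objective):

```python
import torch
import torch.nn.functional as F

def uniformity_loss(z: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    # Encourages normalized features to spread out on the unit sphere.
    z = F.normalize(z, dim=-1)
    sq_dists = torch.cdist(z, z).pow(2)
    mask = ~torch.eye(z.shape[0], dtype=torch.bool, device=z.device)
    return torch.log(torch.exp(-t * sq_dists[mask]).mean())

# Combined objective (sketch); `recon_loss`, `features`, and `lam` come
# from a hypothetical MAE forward pass:
# loss = recon_loss + lam * uniformity_loss(features)
```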
This is the greatest paper that I have ever read, and it should be an inspiration to computer scientists everywhere. The actual innovation offered in the paper is moderate, but the authors don't just show a couple of graphs and tables and extrapolate. They rigorously prove from…
New position paper! Machine Learning Conferences Should Establish a “Refutations and Critiques” Track Joint w/ @sanmikoyejo @JoshuaK92829 @yegordb @bremen79 @koustuvsinha @in4dmatics @JesseDodge @suchenzang @BrandoHablando @MGerstgrasser @is_h_a @ObbadElyas 1/6
Get ready for the PDE-Transformer: our new NN architecture tailored to scientific tasks. It combines hierarchical processing (UDiT), scalability (Swin), and flexible conditioning mechanisms. The paper tum-pbs.github.io/pde-transforme… shows it outperforming existing SOTA architectures 😁
This is an interesting paper on how economists have become key players in central banks over recent decades, showing how economic expertise grew in influence and the roles economists now play inside these institutions (e.g., collecting data, building models).
1/ Generating transition pathways (e.g., folded ↔ unfolded protein) is a huge challenge: we tackle it by combining the scalability of pre-trained, score-based generative models with statistical-mechanics insights, no training required! To appear at #ICML2025
Information-Theoretic Proofs for Diffusion Sampling ift.tt/1T4DrOZ
Error correction is a key component of tomorrow’s large-scale quantum computers. Today we’re excited to report the experimental demonstration of a color code system that provides an advantageous alternative to the well-studied surface code. Check it out → goo.gle/4ljZwli
What causes jailbreaks to transfer between LLMs? We find that jailbreak strength and model representation similarity predict transferability, and we can engineer model similarity to improve transfer. Details in🧵
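One standard way to quantify "model representation similarity" is linear CKA on activations from matched prompts; a small sketch (my assumption for illustration, the paper's exact metric may differ):

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    # X, Y: (n_prompts, dim) activations from two models on the same prompts.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return float(num / den)  # 1.0 = identical up to a linear transform
```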
BREAKING: Stanford just surveyed 1,500 workers and AI experts about which jobs AI will actually replace and automate. Turns out, we've been building AI for all the WRONG jobs. Here's what they discovered: (hint: the "AI takeover" is happening backwards)
This is the first step in a direction that I am very excited about! Using LLMs to solve scientific computing problems and potentially discover faster (or new) algorithms. #AI4Science #ML4PDEs We show that LLMs can write PDE solver code, choose appropriate algorithms, and produce…
Can LLMs solve PDEs? 🤯 We present CodePDE, a framework that uses LLMs to automatically generate PDE solvers and outperform human implementations! 🚀 CodePDE demonstrates the power of inference-time algorithms and scaling for PDE solving. More in 🧵: #ML4PDE #AI4Science
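To make "LLM-generated PDE solver" concrete, here is the kind of minimal solver such a framework might produce for the 1D heat equation (an illustrative sketch, not code from the CodePDE paper):

```python
import numpy as np

def solve_heat_1d(u0, dx, dt, t_final, alpha=1.0):
    """Explicit finite differences for u_t = alpha * u_xx, periodic BCs."""
    assert alpha * dt / dx**2 <= 0.5, "explicit scheme is unstable (CFL)"
    u = u0.copy()
    for _ in range(int(round(t_final / dt))):
        lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
        u += dt * alpha * lap
    return u

# Example: diffusion of a sine wave on [0, 1)
x = np.linspace(0.0, 1.0, 128, endpoint=False)
u = solve_heat_1d(np.sin(2 * np.pi * x), dx=x[1] - x[0], dt=1e-5, t_final=0.01)
```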