Zeming Lin
@ebetica
Co-founder at EvolutionaryScale. ESM3 / ESMFold / PyTorch. Climbs all day. Unsupervised learner but sometimes still gets a few rewards. Views are my own.
ESM3 paper now out in Science! Give it a read. We're also releasing widespread API access - commercial okay. A no-code UI tool to generate your own proteins! And compute grants for academic users doing cool projects.
Read the paper: science.org/doi/10.1126/sc… Our blog post: evolutionaryscale.ai/blog/esm3-rele…
It's August though. Claude is on vacation.
Claude Code seems to have been a bit "dumber" the past few days, just making stupid mistakes. I'd legit pay $500/mo to just have the good version that never gets quantized/degraded during peak times!
since (AI-generated) protein folding (videos) is solved, we now tackle protein conformational changes
Has anyone had any luck getting #veo3 to generate protein folding videos? here are my attempts:
The 14th Street Busway has been a smashing success: faster commutes, fewer crashes, calmer streets. So why is Eric Adams blocking a similar plan for 34th Street?
Unfortunately, the MLSB Workshop @ NeurIPS (@workshopmlsb) was rejected this year. Feedback from the deciding committee indicates it was a coin-flip decision, with 283 proposals and a number of them related to "computational biology". More on the future of MLSB soon…
Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence. Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
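To make the chunking idea concrete, here's a toy sketch (my illustration under assumed details, not the H-Net code): a learned scorer flags byte positions as chunk boundaries, and each chunk is pooled into a single embedding for a higher-level model.

```python
# Toy dynamic chunker: a learned scorer flags chunk boundaries in a byte
# stream, then bytes are mean-pooled into chunk embeddings. All names and
# pooling choices here are illustrative assumptions.
import torch
import torch.nn as nn

class ToyChunker(nn.Module):
    def __init__(self, d=64, vocab=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)   # byte-level embeddings
        self.boundary = nn.Linear(d, 1)       # scores "does a chunk end here?"

    def forward(self, byte_ids, threshold=0.5):
        x = self.embed(byte_ids)                         # (T, d)
        p = torch.sigmoid(self.boundary(x)).squeeze(-1)  # (T,) boundary probs
        ends = (p > threshold).nonzero().squeeze(-1).tolist()
        chunks, start = [], 0
        for end in ends:
            chunks.append(x[start:end + 1].mean(dim=0))  # pool one chunk
            start = end + 1
        if start < len(byte_ids):                        # trailing partial chunk
            chunks.append(x[start:].mean(dim=0))
        return torch.stack(chunks)                       # (num_chunks, d)

print(ToyChunker()(torch.randint(0, 256, (32,))).shape)
```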
Open-source competitor to Anthropic-style agentic models? Take note @cursor_ai!!
🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models 🔹Strong in coding and agentic tasks 🐤 Multimodal & thought-mode not supported for now With Kimi K2, advanced agentic intelligence…
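The 1T-total / 32B-active gap is just top-k routing arithmetic: only k of E experts run per token. A back-of-envelope sketch with illustrative numbers (not Kimi K2's published config):

```python
# Back-of-envelope MoE math: with top-k routing only k of E experts run per
# token, so active params are a small fraction of total params. Numbers
# below are made up to land near ~1T total / ~32B active.
def moe_params(shared_b, n_experts, per_expert_b, top_k):
    total = shared_b + n_experts * per_expert_b
    active = shared_b + top_k * per_expert_b
    return total, active

total, active = moe_params(shared_b=8, n_experts=320, per_expert_b=3.1, top_k=8)
print(f"~{total:.0f}B total, ~{active:.0f}B active per token")  # ~1000B / ~33B
```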
This is so cool!
🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n
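For reference, the update being benchmarked is the plain p ← p − lr·∇p rule, with none of AdamW's moment estimates or decoupled weight decay. A minimal PyTorch version:

```python
# Vanilla SGD, no momentum: each parameter just moves against its gradient.
import torch

def sgd_step(params, lr=1e-3):
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * p.grad  # p <- p - lr * grad, and nothing else

# Equivalently with the built-in optimizer:
# torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.0)
```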
🎉Personal update: I'm thrilled to announce that I'm joining Imperial College London @imperialcollege as an Assistant Professor of Computing @ICComputing starting January 2026. My future lab and I will continue to work on building better Generative Models 🤖, the hardest…
Thrilled to finally share what we've been working on for months at @huggingface 🤝@pollenrobotics Our first robot: Reachy Mini A dream come true: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community. Tiny price, small size, huge…
Super excited to share SmolLM3, a new strong 3B model. SmolLM3 is fully open; we share the recipe, the dataset, the training codebase, and much more! > Trained on 11T tokens on 384 H100s for 220k GPU hours > Supports long context up to 128k thanks to NoPE and intra-document masking >…
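Intra-document masking here means that when documents are packed into one training sequence, attention is restricted to causal positions within the same document. A minimal sketch (my own, not SmolLM3's training code):

```python
# Intra-document causal mask: packed tokens attend only to earlier tokens
# from the same document. doc_ids[t] says which document position t is from.
import torch

def intra_doc_causal_mask(doc_ids: torch.Tensor) -> torch.Tensor:
    T = doc_ids.shape[0]
    causal = torch.tril(torch.ones(T, T, dtype=torch.bool))
    same_doc = doc_ids[:, None] == doc_ids[None, :]
    return causal & same_doc  # True = attention allowed

print(intra_doc_causal_mask(torch.tensor([0, 0, 0, 1, 1, 2])).int())
```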
MCPs are not the way...
it's funny how many people wrote up huge predictions for MCP without even looking into how LLM performance degrades once you give a model even 10 tools
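One reason the degradation bites: every registered tool's schema rides along in the prompt on each turn, so both context cost and the model's tool-selection problem grow with tool count. A rough illustration with a made-up per-tool token average:

```python
# Every tool definition is (re)injected into the prompt, so overhead scales
# linearly with tool count. AVG_SCHEMA_TOKENS is an assumed average, purely
# for illustration.
AVG_SCHEMA_TOKENS = 350

def tool_prompt_overhead(n_tools: int) -> int:
    return n_tools * AVG_SCHEMA_TOKENS

for n in (1, 10, 50):
    print(f"{n:>2} tools -> ~{tool_prompt_overhead(n):,} prompt tokens of schemas alone")
```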
I also built a custom scheduler for the FAIR cluster to scavenge GPUs to fold ESMAtlas 🤦 The good old days of FAIR. Some things never change. The real question is, @ZeyuanAllenZhu, how many times did you bring down the FAIR cluster? 🙃
Facebook AI Research (FAIR) is a small, prestigious lab at Meta. We don't train large models like GenAI or MSL do, so it's natural that we have limited GPUs. GenAI's or MSL's success or failure, past or future, doesn't reflect the work of FAIR. It's important to make this distinction.
Deep learning training is a mathematical dumpster fire. But it turns out that if you *fix* the math, everything kinda just works…fp8 training, hyperparameter transfer, training stability, and more. [1/n]
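One familiar instance of "fixing the math" is µP-style width scaling, where init variance and per-layer learning rates are tied to fan-in so hyperparameters tuned on a small model transfer to a wider one. A toy sketch of that flavor (my illustration; the thread's actual recipe may differ):

```python
# muP-flavored scaling sketch: tie init variance and Adam learning rate to
# fan-in so a base LR tuned at base_fan_in transfers as width grows.
# Constants and names here are assumptions for illustration.
import torch.nn as nn

def mup_like_linear(fan_in, fan_out, base_lr=1e-3, base_fan_in=256):
    layer = nn.Linear(fan_in, fan_out)
    nn.init.normal_(layer.weight, std=fan_in ** -0.5)  # variance ~ 1/fan_in
    nn.init.zeros_(layer.bias)
    lr = base_lr * base_fan_in / fan_in                # LR shrinks with width
    return layer, lr

layer, lr = mup_like_linear(fan_in=1024, fan_out=1024)
print(lr)  # 4x wider than base width -> 4x smaller LR
```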
Our latest open-source speech-to-text model just claimed 1st place among streaming models and 5th place overall on the OpenASR leaderboard 🥇🎙️ While all other models need the whole audio, ours delivers top-tier accuracy on streaming content. Open, fast, and ready for production!
Today we’re releasing SAIR, the Structurally Augmented IC50 Repository. SAIR is the Largest Open-Sourced Binding Affinity Dataset with Cofolded 3D Structures. It includes more than 5 million protein-ligand structures, generated using our Large Quantitative Models and labeled…
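For working with IC50 labels like these, affinities are usually compared on the log scale, pIC50 = -log10(IC50 in molar). A generic conversion helper (standard chemistry, not SAIR-specific code):

```python
# pIC50 = -log10(IC50 in molar); e.g. a 100 nM binder has pIC50 7.0.
import math

def pic50_from_ic50_nM(ic50_nM: float) -> float:
    return -math.log10(ic50_nM * 1e-9)  # nM -> M, then -log10

print(pic50_from_ic50_nM(100.0))  # 7.0
```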
This is a great syllabus!!
Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team @tatsu_hashimoto @marcelroed @neilbband @rckpudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything: