Jason Yim
@json_yim
PhD student @MIT_CSAIL. Generative models, protein design. 🦋 Bluesky handle: https://bsky.app/profile/jyim.bsky.social On X until the exodus is complete.
Combining discrete and continuous data is an important capability for generative models. To address this for protein design, we introduce Multiflow, a generative model for structure and sequence generation. Preprint: arxiv.org/abs/2402.04997 Code: github.com/jasonkyuyim/mu… 1/8
The BioEmu-1 model and inference code are now public under MIT license!!! Please go ahead, play with it and let us know if there are issues. github.com/microsoft/bioe…
Super excited to preprint our work on developing a Biomolecular Emulator (BioEmu): Scalable emulation of protein equilibrium ensembles with generative deep learning from @MSFTResearch AI for Science. #ML #AI #NeuralNetworks #Biology #AI4Science biorxiv.org/content/10.110…
🎉Personal update: I'm thrilled to announce that I'm joining Imperial College London @imperialcollege as an Assistant Professor of Computing @ICComputing starting January 2026. My future lab and I will continue to work on building better Generative Models 🤖, the hardest…
We've open sourced Adjoint Sampling! It's part of a bundled release showcasing FAIR's research and open source commitment to AI for science. github.com/facebookresear… x.com/AIatMeta/statu…
Announcing the newest releases from Meta FAIR. We’re releasing new groundbreaking models, benchmarks, and datasets that will transform the way researchers approach molecular property prediction, language processing, and neuroscience. 1️⃣ Open Molecules 2025 (OMol25): A dataset…
#FPIworkshop best paper award goes to @peholderrieth @msalbergo and Tommi Jaakkola. Congrats and great talk Peter!
I won't be at ICLR 🥲 but you can talk to these other cool people at my poster, Thursday 3-5:30 PM in Hall 3+2B #10!
Excited to share my #ICLR2025 paper, with JC Hütter and friends! Genetic perturbation screens allow biologists to manipulate and measure the genes in cells = discover causal relationships! BUT they are expensive to run, expensive to interpret. ... We use LLMs to help!
Had fun exploring guidance for backbone designability within this latent framework, excited to chat more about guidance with experimental data @gembioworkshop ICLR
I'll be at the ICLR @gembioworkshop workshop presenting latent and structure diffusion for protein backbone generation. Come by to talk all things latent for biology. openreview.net/forum?id=Ek7Hs… arxiv.org/abs/2504.09374
I'll be at ICLR. Come check out our generative modeling work! Reach out if you want to chat. Proteina: x.com/karsten_kreis/… ProtComposer: x.com/HannesStaerk/s… Generator matching: x.com/peholderrieth/…
New paper out! We introduce “Generator Matching” (GM), a method to build GenAI models for any data type (incl. multimodal) with any Markov process. GM unifies a range of state-of-the-art models and enables new designs of generative models. arxiv.org/abs/2410.20587 (1/5)
RFdiffusion => generative binder design. RFdiffusion2 => generative enzyme design. It's rare to find scientists with deep knowledge in chemistry, machine learning, and software engineering like Woody. The complexity of enzymes matches the complexity of his skills. Check out RFD2
New enzymes can unlock chemistry we never had access to before. Here, we introduce RFdiffusion2 (RFD2), a generative model that makes significant strides in de novo enzyme design. Preprint: biorxiv.org/content/10.110… Code: coming soon Animation credit: x.com/ichaydon (1/n)
Excited to share our preprint “BoltzDesign1: Inverting All-Atom Structure Prediction Model for Generalized Biomolecular Binder Design” — a collaboration with @MartinPacesa, @ZhidianZ , Bruno E. Correia, and @sokrypton. 🧬 Code will be released in a couple weeks
Protein dynamics was the first research to enchant me >10yrs ago, but I left in PhD bc I couldnt find big expt data to evaluate models. Today w @ginaelnesr, I'm thrilled to share the big dynamics data I've been dreaming of, and the mdl we trained: Dyna-1. rb.gy/de5axp
Combining prediction, generation, and modalities (sequence, structure, nucleic acids, small molecules, proteins) is the future. Congrats to the authors! Looking forward to the technical report.
Announcing Neo-1: the world’s most advanced atomistic foundation model, unifying structure prediction and all-atom de novo generation for the first time - to decode and design the structure of life 🧵(1/10)
Introducing All-atom Diffusion Transformers — towards Foundation Models for generative chemistry, from my internship with the FAIR Chemistry team @OpenCatalyst @AIatMeta There are a couple ML ideas which I think are new and exciting in here 👇
Awesome work by Hannes and Bowen towards improved control of protein structure generation with MultiFlow!
New paper (and #ICLR2025 Oral :)): ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids arxiv.org/abs/2503.05025 Condition on your 3D layout (of ellipsoids) to generate proteins like this or to get better designability/diversity/novelty tradeoffs. 1/6
I really enjoyed seeing how protein generation models scale with more data and weights. Congrats to Nvidia and the core contributors for this amazing work!
📢📢 "Proteina: Scaling Flow-based Protein Structure Generative Models" #ICLR2025 (Oral Presentation) 🔥 Project page: research.nvidia.com/labs/genair/pr… 📜 Paper: arxiv.org/abs/2503.00710 🛠️ Code and weights: github.com/NVIDIA-Digital… 🧵Details in thread... (1/n)
🔥Excited to present *Proteina* next week (accepted as Oral to ICLR'25) in the ML Protein Engineering Seminar Series. Hear about scalable protein backbone generation, fold class guidance, and more! Also, stay tuned for the upcoming official release, including code and weights!🔥
Next Tues (3/4) at 4PM ET we will have @karsten_kreis present "Proteina: Scaling Flow-based Protein Structure Generative Models" Paper: openreview.net/forum?id=TVQLu… Sign up on our website for zoom links!
Our MIT class “6.S184: Introduction to Flow Matching and Diffusion Models” is now available on YouTube! We teach state-of-the-art generative AI algorithms for images, videos, proteins, etc. together with the mathematical tools to understand them. diffusion.csail.mit.edu (1/4)
Tired of slow diffusion models? Our new paper introduces f-distill, enabling arbitrary f-divergence for one-step diffusion distillation. JS divergence gives SOTA results on text-to-image! Choose the divergence that suits your needs. Joint work with @wn8_nie @ArashVahdat 1/N
We made a new, reproducible, fair, (and much harder) motif scaffolding benchmark! With Zhuoqi Zheng, Bo Zhang, @DidiKieran @json_yim @_JosephWatson Hai-Feng Chen, @brianltrippe
🔥 Benchmark Alert! MotifBench sets a new standard for evaluating protein design methods for motif scaffolding. Why does this matter? Reproducibility & consistent evaluation have been lacking—until now. Paper: arxiv.org/abs/2502.12479 | Repo: github.com/blt2114/MotifB… A thread ⬇️