Jason Yang
@jsunn_y
PhD candidate @Caltech studying ML for protein engineering | @jsunn-y.bsky.social | 65% oxygen, 18% carbon, 10% illenium, 7% caesar salad | 🌈
I'm excited to share our new preprint "Steering Generative Models with Experimental Data for Protein Fitness Optimization" (🧵1/5)! Paper: arxiv.org/abs/2505.15093 Code: github.com/jsunn-y/SGPO
@jsunn_y, @WendaChu32619 & team doing some awesome work in helping us better understand wetlab-in-the-loop guided diffusion for protein optimization.
I absolutely love this paper!! 🌟 Beautiful work by @jsunn_y @yisongyue and team to show that discrete diffusion models can be effectively guided by fitness data from low-throughput wet-lab assays! 🧪 I'm telling you guys: guided sequence generation is where it's at! 🤙
Steering Generative Models with Experimental Data for Protein Fitness Optimization 1. This paper introduces SGPO (Steered Generation for Protein Optimization), a principled and scalable framework that guides generative models of protein sequences using small amounts of real…
Steered generation for protein optimization: On datasets with ~10^2 measurements, steering a discrete diffusion model outperforms RL on a protein language model for generating improved variants. @jsunn_y @WendaChu32619 @francesarnold @yisongyue
I’ll be at #ICLR2025 in Singapore this week! I’ll also be presenting some new work at the GEM workshop on Sun, April 27. Please reach out if you want to link up!
Happy to share that our work on Active Learning-Assisted Directed Evolution is now published in @NatureComms! We show that it's an effective and broadly applicable method to accelerate protein engineering with machine learning. Paper: nature.com/articles/s4146…
Excited to share our preprint on Active Learning-Assisted Directed Evolution (ALDE)! We present a practical workflow that leverages uncertainty quantification to efficiently navigate protein fitness landscapes. 🧵(1/6) Paper: biorxiv.org/content/10.110… Code: github.com/jsunn-y/ALDE
Come visit our CARE benchmarks poster today at 11-2pm! (West Ballroom #5205)
I’m at NeurIPS this week presenting two of my recent projects! Enzyme function (CARE) benchmarks: Friday 11-2pm West Ballroom #5205 @Caltech Conditional generation from PLMs (ProCALM): Sunday at MLSB Workshop @ProfluentBio Please come say hi!
CARE: a Benchmark Suite for the Classification and Retrieval of Enzymes • This paper introduces CARE, a benchmarking suite for enzyme function prediction, focusing on two primary tasks: enzyme classification by EC number (Task 1) and enzyme retrieval based on chemical reactions…
Machine learning-guided directed evolution strategies exceeded or at least matched DE performance, with the advantages becoming more pronounced as landscapes had fewer active variants and more local optima. @francescazfl @yisongyue @jsunn_y @kadinaj @francesarnold
Very useful evaluation of machine learning strategies for #proteinengineering #directedevolution by @francescazfl @yisongyue @jsunn_y @kadinaj and coworkers. #MLDE #machinelearning #AIforproteins
Evaluation of Machine Learning-Assisted Directed Evolution Across Diverse Combinatorial Landscapes • This study explores the impact of machine learning-assisted directed evolution (MLDE) across 16 combinatorial fitness landscapes, including proteins involved in binding and…
Great read! Thanks for highlighting ALDE!
🧬📈 How to optimize proteins using ML models - Part 1. For this blog post, we surveyed the state-of-the-art ML methods for optimizing proteins such as antibodies and enzymes. We focused on adaptive methods suitable for cost-constrained campaigns. Check it out here:…
1/ Who doesn’t love internships in protein ML research and open science? Check out the recent work by Profluent intern and Caltech PhD candidate @jsunn_y.
Very well deserved!
BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Chemistry with one half to David Baker “for computational protein design” and the other half jointly to Demis Hassabis and John M. Jumper “for protein structure prediction.”