Fred Zhangzhi Peng
@pengzhangzhi1
#ML & #ProteinDesign. PhD student @DukeU.
Protein structure prediction contest CASP gets temporary funding from Google DeepMind as NIH grant runs out. trib.al/bGoz7lf
This is great! But will you also consider setting up an official satellite location in China, given that so many great NeurIPS papers come from China and so many Chinese researchers couldn't attend the conference due to US/Canada visa issues?
Autoregressive models are too restrictive, forcing a fixed generation order, while masked diffusion is wasteful, fitting all possible orders. Can our model dynamically decide the next position to generate based on context? Learn more in our ICML paper arxiv.org/abs/2503.05979
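Rough illustration of the idea (a sketch, not the paper's exact method): give the model a position head that scores the still-masked slots so it can choose where to generate next, then a token head to fill that slot. The `model` interface below is assumed for the sketch.

```python
import torch

def generate_any_order(model, seq_len, mask_id, device="cpu"):
    """Sketch of order-learning generation: pick a position, then a token."""
    tokens = torch.full((1, seq_len), mask_id, device=device)
    masked = torch.ones(1, seq_len, dtype=torch.bool, device=device)
    for _ in range(seq_len):
        # assumed interface: returns (position_logits, token_logits)
        pos_logits, tok_logits = model(tokens)            # (1, L), (1, L, V)
        pos_logits = pos_logits.masked_fill(~masked, float("-inf"))
        pos = torch.distributions.Categorical(logits=pos_logits).sample()  # where next
        tok = torch.distributions.Categorical(logits=tok_logits[0, pos]).sample()
        tokens[0, pos] = tok
        masked[0, pos] = False
    return tokens
```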
🎉Personal update: I'm thrilled to announce that I'm joining Imperial College London @imperialcollege as an Assistant Professor of Computing @ICComputing starting January 2026. My future lab and I will continue to work on building better Generative Models 🤖, the hardest…
FlashAttention-accelerated Protein Language Models ESM2 now supports Huggingface. One line change, up to 70% faster and 60% less memory! 🧬⚡ Huggingface: huggingface.co/fredzzp/esm2_t… Github: github.com/pengzhangzhi/f…
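For context, loading it should follow the standard Hugging Face pattern below. The repo name in the link above is truncated, so the model id here is a placeholder, and this is a sketch rather than the repo's documented usage.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Placeholder repo id (the exact checkpoint name is truncated in the post).
# trust_remote_code=True lets transformers load a custom FlashAttention-backed
# ESM2 implementation shipped with the checkpoint.
model_id = "fredzzp/esm2-flash"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```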

Insights on training the next generation of pLMs. Good work as always @houchao1
Why do large protein language models like ESM2-15B underperform medium-sized ones like ESM2-650M at predicting mutation effects? 🤔 We dive into this question in our new preprint, offering insights into how model scale affects mutation effect prediction. 🧬📉
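For background on the task: the usual zero-shot recipe scores a point mutation with a masked-LM log-likelihood ratio of the mutant vs. wild-type residue at the mutated position. A minimal sketch (illustrative, not the preprint's specific protocol):

```python
import torch

def mutation_effect_score(model, tokenizer, sequence, pos, wt_aa, mut_aa):
    """Zero-shot score: log p(mutant) - log p(wild-type) at the masked position."""
    tokens = tokenizer(sequence, return_tensors="pt")
    input_ids = tokens["input_ids"].clone()
    input_ids[0, pos + 1] = tokenizer.mask_token_id       # +1 skips the BOS/CLS token
    with torch.no_grad():
        logits = model(input_ids=input_ids).logits
    log_probs = torch.log_softmax(logits[0, pos + 1], dim=-1)
    wt_id = tokenizer.convert_tokens_to_ids(wt_aa)
    mut_id = tokenizer.convert_tokens_to_ids(mut_aa)
    return (log_probs[mut_id] - log_probs[wt_id]).item()  # > 0 favors the mutant
```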
Super excited about Singapore and ICLR 2025. I'll present my work on masked diffusion models at @ai4na_workshop #DeLTa #FPI and @gembioworkshop. Please stop by the posters and chat about MDMs and protein design :)
Super excited to be at #ICLR2025 in Singapore! 🇸🇬 My students and I are presenting 20+ accepted works across the workshops/main, including at @ai4na_workshop @lmrl_bio #DeLTa #FPI! We're especially excited to co-host the @gembioworkshop tomorrow!! 💻🧬🧫 Come say hi!! 👋
A generative protein language model! I'm assuming it's GPT-style?
What if the same AI advancements that have transformed ChatGPT could be replicated in biology? Enter ProGen3, our latest foundation model suite for protein generation.
We are hiring a student researcher at Google DeepMind to work on fundamental problems in discrete generative modeling! Examples of our recent work: masked diffusion: arxiv.org/abs/2406.04329 learning-order AR: arxiv.org/abs/2503.05979 If you find this interesting, please send an…
PTM-Mamba: a post-translational modification-aware protein language model, for protein modeling and design. nature.com/articles/s4159…
How can protein language models incorporate post-translational modifications (PTMs) to better represent the functional diversity of the proteome? @naturemethods @DukeU "PTM-Mamba: a PTM-aware protein language model with bidirectional gated Mamba blocks" Authors: @pengzhangzhi1…
PTM-Mamba: a PTM-aware protein language model with bidirectional gated Mamba blocks @naturemethods 1. PTM-Mamba is the first protein language model explicitly designed to encode post-translational modifications (PTMs), using a novel bidirectional gated Mamba architecture fused…
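Since the post only names the block, here is a loose sketch of what "bidirectional gated" sequence mixing can look like, with nn.GRU standing in for the Mamba SSM; the real PTM-Mamba block almost certainly differs in its fusion and gating details.

```python
import torch
import torch.nn as nn

class BidirectionalGatedBlock(nn.Module):
    """Illustrative stand-in: forward and reversed passes, fused with a gate."""
    def __init__(self, dim):
        super().__init__()
        self.fwd = nn.GRU(dim, dim, batch_first=True)   # left-to-right pass
        self.bwd = nn.GRU(dim, dim, batch_first=True)   # right-to-left pass
        self.gate = nn.Linear(2 * dim, dim)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, x):                                # x: (batch, length, dim)
        h_fwd, _ = self.fwd(x)
        h_bwd, _ = self.bwd(torch.flip(x, [1]))
        h_bwd = torch.flip(h_bwd, [1])
        h = torch.cat([h_fwd, h_bwd], dim=-1)
        return torch.sigmoid(self.gate(h)) * self.proj(h)  # gated fusion
```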
Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources
Check out our work applying discrete diffusion models to noncoding RNA sequence design!
We introduce EvoFlow-RNA, a masked diffusion model that can unconditionally generate (and optimize) novel, naturalistic non-coding RNAs. We show state-of-the-art results on RNA representation learning, unconditional RNA design, and more! 💡 🧵 (1/11)
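For readers new to masked diffusion: sampling typically starts from an all-masked sequence and unmasks a fraction of positions each step. A generic absorbing-state sampler sketch (not EvoFlow-RNA's specific recipe; the `model` interface is assumed):

```python
import torch

def masked_diffusion_sample(model, seq_len, mask_id, steps=16, device="cpu"):
    """Iteratively reveal the most confident masked positions over `steps` rounds."""
    tokens = torch.full((1, seq_len), mask_id, device=device)
    for step in range(steps):
        logits = model(tokens)                            # assumed: (1, L, vocab)
        probs = torch.softmax(logits, dim=-1)
        conf, pred = probs.max(dim=-1)                    # per-position confidence
        still_masked = tokens == mask_id
        n_masked = int(still_masked.sum().item())
        if n_masked == 0:
            break
        conf = conf.masked_fill(~still_masked, -1.0)      # only consider masked slots
        n_unmask = max(1, n_masked // (steps - step))     # linear reveal schedule
        idx = conf.topk(n_unmask, dim=-1).indices[0]
        tokens[0, idx] = pred[0, idx]
    return tokens
```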
Discrete diffusion models are reaching the level of commercial LLMs!
We are excited to introduce Mercury, the first commercial-grade diffusion large language model (dLLM)! dLLMs push the frontier of intelligence and speed with parallel, coarse-to-fine text generation.