fajie yuan

@duguyuan

Assistant Prof at Westlake University

Hangzhou, China

Joined September 2014

393 Following

1K Followers

Pinned

fajie yuan@duguyuan · Jan 7

We release our protein ChatGPT, Evola! 🌟 chat-protein.com Evola comes in two versions: 10B & 80B. The 80B model has a 1.3B SaProt encoder & a 70B LLaMA3 decoder. Trained on 546M protein question-text pairs with 150 billion word tokens! 💡🔬 biorxiv.org/content/10.110…

DailyHealthcareAI@aipulserx · Jan 7

How can we effectively decode and understand the complex molecular language of proteins to unlock their functional secrets at scale? @biorxivpreprint @Westlake_Uni "Decoding the Molecular Language of Proteins with Evola" • Scientists have developed Evola, an 80 billion…


fajie yuan Retweeted

Shubham Saboo@Saboo_Shubham_ · Jul 11

After DeepSeek R1, there's a new Claude 4-level model from China that outperforms DeepSeek v3, Qwen and OpenAI GPT-4.1. Meet Kimi k2: a 1 trillion parameter model purpose-built for agentic workflows with native MCP integration. 100% open source and FREE to try. Let that sink in.


fajie yuan Retweeted

Sergey Ovchinnikov@sokrypton · Dec 3, 2020

Put together a gif showing how NNs have taken over CASP 😀


fajie yuan Retweeted

Niko McCarty.@NikoMcCarty · Jul 5

This week in biotechnology: 1. There is a lot of skepticism about virtual cells. I don't think negative criticism is worthwhile, in part, because the first version of everything tends to be bad. Also, AlphaFold2 came out a full 26 years after the CASP competition first began!…


fajie yuan Retweeted

Biology+AI Daily@BiologyAIDaily · Jul 6

SSAlign: Ultrafast and Sensitive Protein Structure Search at Scale 1. The rapid growth of protein structure databases, fueled by AlphaFold3 and ESMFold, demands faster, more sensitive search tools. Existing methods like Foldseek struggle with sensitivity and scalability,…


fajie yuan Retweeted

Chai Discovery@chaidiscovery · Jun 30

We’re excited to introduce Chai-2, a major breakthrough in molecular design. Chai-2 enables zero-shot antibody discovery in a 24-well plate, exceeding previous SOTA by >100x. Thread👇


fajie yuan@duguyuan · Jun 27

Always a pleasure to interact and work with @anthonygitter !! He is up to some fun things at Morgridge :)

datascience@uw@datascience_uw · Jun 19

.@anthonygitter designs computational methods to study diseases and develop new drugs and proteins. He also develops machine learning models to speed up the process of drug discovery. Learn more about his work in Faces of Data Science: datascience.wisc.edu/#faces @Morgridge_Inst


fajie yuan Retweeted

Qingxiu Dong@qx_dong · Jun 10

⏰ We introduce Reinforcement Pre-Training (RPT🍒), reframing next-token prediction as a reasoning task using RLVR ✅ General-purpose reasoning 📑 Scalable RL on web corpus 📈 Stronger pre-training + RLVR results 🚀 Allows allocating more compute to specific tokens


fajie yuan@duguyuan · Jun 4

deeply honored.

GAMA Miguel Angel 🐦‍⬛🔑@miangoar · Jun 3

It feels like they're announcing a festival along with the headliners. Here is a personal list of researchers in AI-based protein science whose work I really admire, they’re incredibly creative! And a longer list of other colleagues working in the field x.com/miangoar/statu…


fajie yuan@duguyuan · May 25

1/2 This meme is gold 😂 But I don't know, maybe I'm in the middle of the distribution. I think AF3 was designed to mimic the data rather than truly understand it. Because of that, there are issues with hallucination and memorization as indicated here x.com/miangoar/statu…

GAMA Miguel Angel 🐦‍⬛🔑@miangoar · May 23

1/2 Ha, so far structure pred looks like: AlphaFold2: we added phys/chem/evo traits to our model. AF3: we removed almost all bio-inspired traits to create a more generalizable model. Post-AF3: we have problems of hallucination/memorization, future models require bio-inspired traits


fajie yuan@duguyuan · May 22

cool

Biology+AI Daily@BiologyAIDaily · May 21

VENUSX: Unlocking Fine-Grained Functional Understanding of Proteins 1. VENUSX is the first large-scale benchmark specifically designed to evaluate protein models at fine-grained functional levels (residue, fragment, and domain), addressing the critical gap left by coarse…


fajie yuan Retweeted

David van Dijk@david_van_dijk · May 20

Excited to see our Cell2Sentence collaboration with @GoogleAI @GoogleDeepMind featured in Nature News! Check it out here: nature.com/articles/d4158… 🧬


fajie yuan@duguyuan · May 20

Excited to share what my team has been working on lately - Gemini diffusion! We bring diffusion to language modeling, yielding more power and blazing speeds! 🚀🚀🚀 Gemini diffusion is especially strong at coding. In this example the model generates at 2000 tokens/sec,…

Google DeepMind@GoogleDeepMind · May 20

We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO


fajie yuan@duguyuan · May 19

As a new paradigm, Prot2Text Model is becoming popular.

Biology+AI Daily@BiologyAIDaily · May 19

Prot2Text-V2: Protein Function Prediction with Multimodal Contrastive Alignment 1. Prot2Text-V2 introduces a powerful framework that generates free-form natural language descriptions of protein function directly from amino acid sequences, moving beyond structured labels like GO…


fajie yuan Retweeted

Jeanne Trinquier@JeannefaustineT · May 15

Our new preprint, "SoftAlign: End-to-end protein structure alignment," is now on bioRxiv! biorxiv.org/content/10.110…
