Ian Shi

@ianshi3

(Almost former) Graduate Student @ Department of Computer Science, University of Toronto. Building something new at @blankbio_

Joined November 2019

37 Following

88 Followers

Pinned

Ian Shi @ianshi3 · Jul 15

We're excited to release 𝐦𝐑𝐍𝐀𝐁𝐞𝐧𝐜𝐡, a new benchmark suite for mRNA biology containing 10 diverse datasets with 59 prediction tasks, evaluating 18 foundation model families. Paper: biorxiv.org/content/10.110… GitHub: github.com/morrislab/mRNA… Blog: blank.bio/post/mrnabench

6.0K

Ian Shi Retweeted

Phil Fradkin @phil_fradkin · Jul 15

Why do models trained on mRNA outperform those trained on DNA? We found the sequence 'language' is fundamentally different. Our compression-based analysis quantifies this distributional shift, showing the regulatory code in mature mRNA is distinct from other genomic regions.

317

Ian Shi Retweeted

Taykhoom Dalal @Taykhoom_Dalal · Jul 15

I am excited to introduce mRNABench, a comprehensive benchmarking suite that we used to evaluate the representational capabilities of 18 families of nucleotide foundation models on tasks specific to mature mRNA. Paper: doi.org/10.1101/2025.0… Code: github.com/morrislab/mRNA… A 🧵

8.0K

Ian Shi Retweeted

Vahid Balazadeh @vahidbalazadeh · Jun 10

Can neural networks learn to map from observational datasets directly onto causal effects? YES! Introducing CausalPFN, a foundation model trained on simulated data that learns to do in-context heterogeneous causal effect estimation, based on prior-fitted networks (PFNs). Joint…

3.0K

Ian Shi Retweeted

Rahul G. Krishnan @rahulgk · Apr 23

🚀 Problem: Language models struggle with rapidly evolving info and context in fields like medicine & finance. We need ways to post-train LLMs to control how they absorb new knowledge. 🔍 Insight: Why not explain, and teach, LLMs how to learn? @YounwooC will be at #ICLR2025…

1.0K

Ian Shi Retweeted

Arc Institute @arcinstitute · Feb 19

Announcing Evo 2: The largest publicly available AI model for biology to date, capable of understanding and designing genetic code across all three domains of life. arcinstitute.org/manuscripts/Ev…

265

1.0K

358

199.0K

Ian Shi Retweeted

Nadav Brandes @BrandesNadav · Feb 1

New preprint claims that most existing DNA language models perform just as well with random weights, suggesting that pretraining does nothing (Mistral & DNABERT-2 look like exceptions). We need better DNA language models.

456

300

102.0K

Ian Shi @ianshi3 · Dec 15

Phil (@phil_fradkin) and I will be presenting Orthrus (biorxiv.org/content/10.110…) as a spotlight poster at the Workshop on AI for New Drug Modalities at #NeurIPS2024! Our poster will be up starting at 11:40 AM in West Meeting Rooms 109 and 110. Excited to be sharing some new results!

3.0K