Noam Teyssier
@noamteyssier
Bioinformatics Scientist at the Arc Institute. Working on bioinformatics tools for functional genomics and ML in single cell.
I'm excited to release what I've been cooking up the past few months at @arcinstitute: BINSEQ, a family of binary file formats for sequencing data, built with paired records and parallel processing in mind. It delivers big performance gains (2x-40x) over gzipped FASTQ with similar storage.
BINSEQ: A Family of High-Performance Binary Formats for Nucleotide Sequences biorxiv.org/content/10.110… #biorxiv_bioinfo
Dropped the Virtual Cell Challenge Primer on HF. We are shipping transformers support for STATE (the SOTA model for predicting perturbation response) very soon!
I use this all the time! Great for quick inspections without loading up a Jupyter notebook. I actually submitted a PR to uv a while back so the `--with` flag accepts a comma-separated list, so you can run a simple one-liner: `uvx --with numpy,polars,seaborn ipython`
If you run `uvx python` with a Python version that isn't already installed, we'll install it for you. Install uv on a new machine, run `uvx --with polars python3.14`, and uv will drop you into a REPL with the latest Python 3.14 beta and Polars installed.
Register today for the Virtual Cell Challenge and use AI to solve one of biology’s most complex problems. Announced in @CellCellPress, the competition is hosted by Arc Institute and sponsored by @nvidia, @10xGenomics, and @UltimaGenomics.
Cells are dynamic, messy and context dependent. Scaling models across diverse states needs flexibility to capture heterogeneity. Introducing State, a transformer that predicts perturbation effects by training over sets of cells. Team effort led by the unstoppable @abhinadduri
Today @arcinstitute releases State, our first perturbation prediction AI model and an important step towards our goal of a virtual cell. State is designed to learn how to shift cells between states (e.g. “diseased” to “healthy”) using drugs, cytokines, or genetic perturbations.
Introducing Arc Institute’s first virtual cell model: STATE
0 citations but one can only hope this becomes industry standard for sequencing data biorxiv.org/content/10.110…
Excited to share the Kernel Hub, optimized CUDA kernels, plug-and-play from the Hugging Face Hub. No boilerplate, just speed. huggingface.co/blog/hello-hf-…
Ten years after Rust 1.0, we need to stop thinking of it as just "systems programming." That label carries historic baggage and scares away teams who could benefit from Rust. corrode.dev/blog/foundatio… #rustlang #rust
Slides from my talk (with Kamil Jaron) on a history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kmers…
Genomes encode biological complexity, which is determined by combinations of DNA mutations across millions of bases. In new @arcinstitute work, we report the discovery and engineering of the first programmable DNA recombinases capable of megabase-scale human genome rearrangement.
What if we could universally recombine, insert, delete, or invert any two pieces of DNA? In back-to-back @Nature papers, we report the discovery of bridge RNAs and 3 atomic structures of the first natural RNA-guided recombinase: a new mechanism for programmable genome design.
CERN Scientists today:
Scientists at CERN's Large Hadron Collider have successfully transformed lead into gold atoms, achieving an ancient alchemist dream through modern physics. abcnews.link/zqp39oz
Thanks to Noam's clean code & Rust's readability, I was able to help Noam integrate this in short order! This is a feature I've long wanted from fastq-dump/fasterq-dump, it took xsra to get it! Stream multiple segments from xsra directly to your downstream preprocessing tools!
Just merged in a nice PR to xsra this morning to increase streaming support with named pipes (FIFO). You can stream your R1/R2 directly to other tools and skip the intermediate write step. It works with both on- and off-disk accessions. Give it a shot! github.com/arcInstitute/x…
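For anyone who hasn't used named pipes before, here's a rough sketch of the general idea in Python. It is not xsra's code or CLI: the function names, the `R1.fq`/`R2.fq` file names, and the toy records are all made up for illustration, and the "producer" threads just stand in for whatever tool writes the reads. It assumes a POSIX system, since `os.mkfifo` isn't available on Windows.

```python
import os
import tempfile
import threading


def write_fastq(path, records):
    # Hypothetical producer: stands in for a tool that emits reads.
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(path, "w") as fh:
        for name, seq, qual in records:
            fh.write(f"@{name}\n{seq}\n+\n{qual}\n")


def read_pairs(r1_path, r2_path):
    # Hypothetical consumer: reads R1 and R2 in lockstep, 4 lines per record.
    pairs = []
    with open(r1_path) as f1, open(r2_path) as f2:
        while True:
            rec1 = [f1.readline().rstrip("\n") for _ in range(4)]
            rec2 = [f2.readline().rstrip("\n") for _ in range(4)]
            if not rec1[0] or not rec2[0]:
                break  # an empty header line means one of the pipes hit EOF
            pairs.append((rec1[1], rec2[1]))
    return pairs


def stream_pairs(r1_records, r2_records):
    # Create two FIFOs in a temp directory; no sequence data is stored on disk.
    with tempfile.TemporaryDirectory() as tmp:
        r1_path = os.path.join(tmp, "R1.fq")
        r2_path = os.path.join(tmp, "R2.fq")
        os.mkfifo(r1_path)
        os.mkfifo(r2_path)
        writers = [
            threading.Thread(target=write_fastq, args=(r1_path, r1_records)),
            threading.Thread(target=write_fastq, args=(r2_path, r2_records)),
        ]
        for w in writers:
            w.start()
        pairs = read_pairs(r1_path, r2_path)
        for w in writers:
            w.join()
        return pairs


if __name__ == "__main__":
    r1 = [("read1/1", "ACGT", "IIII"), ("read2/1", "GGCA", "IIII")]
    r2 = [("read1/2", "TTGA", "IIII"), ("read2/2", "CCAT", "IIII")]
    print(stream_pairs(r1, r2))
```

The point is that the consumer opens the FIFOs like ordinary files, so a downstream tool that takes R1/R2 paths needs no changes, and the data flows through kernel pipe buffers instead of an intermediate write.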
