Zach Nussbaum
@zach_nussbaum
https://www.nuss-and-bolts.com/ | prev @nomic_ai 🗺️📍
We trained all of the Nomic Embed models on limited compute. One trick that helped us train SoTA embeddings on 16 H100s? GradCache, a gradient checkpointing-like technique tailored for contrastive learning. I kept forgetting how it works, so I dug into the math and wrote about it…
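The core idea behind GradCache can be sketched in a few lines: embed the whole batch without gradients, compute the contrastive loss on detached embeddings to cache the gradients w.r.t. those embeddings, then re-encode in small chunks with gradients and backprop the cached gradients through. A minimal PyTorch sketch (toy linear encoder and in-batch InfoNCE loss are illustrative stand-ins, not the Nomic training code):

```python
import torch

# Toy encoder and batch; stand-ins for a real embedding model.
encoder = torch.nn.Linear(8, 4)
queries = torch.randn(16, 8)
docs = torch.randn(16, 8)

def info_nce(q, d, tau=0.05):
    # In-batch contrastive loss: each query's positive is the
    # same-index document; all others are negatives.
    logits = (q @ d.T) / tau
    labels = torch.arange(q.shape[0])
    return torch.nn.functional.cross_entropy(logits, labels)

# Step 1: embed the full batch WITHOUT gradients (no activations kept,
# so memory stays flat no matter the batch size).
with torch.no_grad():
    q_emb = encoder(queries)
    d_emb = encoder(docs)

# Step 2: compute the loss on detached embeddings that require grad,
# and cache the gradients w.r.t. the embeddings themselves.
q_emb_g = q_emb.clone().requires_grad_(True)
d_emb_g = d_emb.clone().requires_grad_(True)
info_nce(q_emb_g, d_emb_g).backward()
q_grad, d_grad = q_emb_g.grad, d_emb_g.grad

# Step 3: re-encode in small chunks WITH gradients, feeding the cached
# per-embedding gradients into backward(). Gradient accumulation over
# chunks reproduces the exact full-batch parameter gradient.
chunk = 4
for inputs, cached_grad in ((queries, q_grad), (docs, d_grad)):
    for i in range(0, inputs.shape[0], chunk):
        out = encoder(inputs[i:i + chunk])
        out.backward(cached_grad[i:i + chunk])
```

Because the loss is a sum over embeddings, chunked re-encoding accumulates exactly the same parameter gradient as one giant backward pass, but peak memory only scales with the chunk size.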

New blogpost and demo! I optimized a Flappy Bird world model to run locally in my web browser at 30 FPS (demo and blog in replies)
Pangram was featured in a @DebunkEU investigation identifying thousands of bots on X spreading AI-generated pro-Kremlin disinformation. Link below
🤔 Have you ever wondered how good ModernBERT is compared to decoders like Llama? We made an open-data version of ModernBERT and used the same recipe for encoders and decoders. Turns out, our encoder model beats ModernBERT and our decoder model beats Llama 3.2 / SmolLM2 🤯 🧵
daniel's put a lot of what i've been thinking about into words: when and how much should we automate in the face of things like claude code? i *really* like the conscious architects framing
I am trying out this Thought-Boi Thing. Give it a read. The Hidden Cost of Augmentation: Every Tool You Use Changes You. open.substack.com/pub/spacemanid…
In the beginning, there was BERT. Eventually BERT gave rise to RoBERTa. Then, DeBERTa. Later, ModernBERT. And now, NeoBERT. The new state-of-the-art small-sized encoder:
so exciting to get a chance to collaborate with @Wikipedia & @Wikimedia on the first full multilingual wikipedia map! even more excited that the entire pipeline (encoder, article vectors, and visualization method) is open source 🧵 enterprise.wikimedia.com/blog/nomic-ai-…
finally had some time to read this great blog! really helps motivate the why behind things like disaggregated serving: jax-ml.github.io/scaling-book/i…
deep dive on LLM inference (read it if you haven't already!) link in the post below
jack is not only a 10/10 researcher, but also a 10/10 person. any org would be lucky to have him!
hello twittersphere! i am planning to graduate in a few months, so i am officially ✨ Looking For A Job ✨ if you know of a role that'd be a good fit, or just want to chat, please reach out! here are some projects i've worked on that i'm most proud of 👇
Modern retrievers can perform reasoning internally, yet they still benefit from reasoning traces produced by LLMs! So how does Reason-ModernColBERT perform when paired with GPT-4? Well, this time it takes the crown, outperforming ReasonIR-8B and 7B rerankers!
how is it not possible to access desktop substack drafts on the mobile app