Jinen Setpal
@48bitmachine
PhD student @PurdueECE, researching deep learning optimization theory and intrinsic interpretability. I love open source. @jinen:matrix.org
Currently working on RFMs for tool use and other industrial tasks @personaaiinc. We have an early customer (Hyundai) and significant capital ($28M pre-seed). Looking to expand the ML team, DM me if you're interested in joining. Also taking cracked interns :)
Paper: arxiv.org/abs/2507.06952 Co-authors: Peter Chang (@petergchang), Ashesh Rambachan (@asheshrambachan), Sendhil Mullainathan (@m_sendhil)
How do transformers carry out recurrent computations while being fundamentally feedforward? Excited to present our work on Constrained Belief Updating at #ICML2025, where we show that attention carries out a spectral algorithm in order to parallelize Bayes updating.
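A toy illustration of the parallelization idea (my own sketch, not the paper's construction): when observations are conditionally independent given the hidden state, recurrent Bayes updating collapses to a running sum of log-likelihoods, so all posteriors can be computed at once with a prefix sum rather than a step-by-step recurrence.

```python
import numpy as np

def sequential_posteriors(prior, likelihoods):
    """Recurrent Bayes updates: one normalized posterior per time step."""
    posts = []
    belief = prior.copy()
    for lik in likelihoods:          # lik[k] = p(obs_t | state k)
        belief = belief * lik
        belief = belief / belief.sum()
        posts.append(belief)
    return np.array(posts)

def parallel_posteriors(prior, likelihoods):
    """Same posteriors via a cumulative sum in log space (parallelizable)."""
    log_post = np.log(prior) + np.cumsum(np.log(likelihoods), axis=0)
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
prior = np.array([0.5, 0.5])
liks = rng.uniform(0.1, 1.0, size=(8, 2))  # 8 steps, 2 hidden states
assert np.allclose(sequential_posteriors(prior, liks),
                   parallel_posteriors(prior, liks))
```

The hidden-state dynamics and the spectral structure the paper identifies in attention are more involved than this; the sketch only shows why "parallel Bayes" is possible at all.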
Huge win for open source!!! Thank you Anthropic!!
Our interpretability team recently released research that traced the thoughts of a large language model. Now we’re open-sourcing the method. Researchers can generate “attribution graphs” like those in our study, and explore them interactively.
Submitted the first preprint of my PhD! It's been one hell of a journey, getting it over the line was a good way to close out the first school year. Spent yesterday preparing some very last minute derivations (📸 by @ABuynitsky)

Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements over prior work? We propose 😎 𝗠𝗜𝗕: a Mechanistic Interpretability Benchmark!
Deep learning for image matching lecture slides. Kiki and bouba, DUSt3R, and other things. Would be grateful for the feedback and criticism. cmp.felk.cvut.cz/~mishkdmy/slid…
This is GLORIOUS. All over for PyTorch I dare say, TexTorch coming soon?
A #machinelearning library written in pure LaTeX
Since we launched over a month ago, we’ve had 6000+ submissions! Now, we’re pushing a huge update to the platform: new problems, Triton support, a new CLI tool, and so much more. Test your kernel writing skills (and blow through our @modal_labs credits) at tensara.org!
My new @WIRED piece about arXiv: either 1) the most important website you've never heard of, or 2) the platform your entire field depends on. Also had the pleasure of hanging out with its creator, Paul Ginsparg, the Forrest Gump of the Internet age.
Jacob's up there with the best, most driven people I've ever had the opportunity to work alongside. He's recruiting for top talent in robotics - if you're looking to do foundational work pushing the boundaries of the state-of-the-art, you should apply!
Building the toughest humanoid robots for a unique use case. Large pre-seed. Already have a large customer that justifies the embodiment (TBA). Come join the founding team of Persona! jobs.ashbyhq.com/personainc.ai/
We created SuperBPE🚀, a *superword* tokenizer that includes tokens spanning multiple words. When pretraining at 8B scale, SuperBPE models consistently outperform the BPE baseline on 30 downstream tasks (+8% MMLU), while also being 27% more efficient at inference time.🧵
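A toy sketch of what a "superword" vocabulary buys you (my own illustration, not the released SuperBPE training algorithm): if the vocabulary is allowed to contain tokens that span whitespace, a greedy longest-match tokenizer can encode a common multi-word phrase as a single token instead of several. The `vocab` here is hypothetical.

```python
def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position."""
    tokens, i = [], 0
    max_len = max(len(t) for t in vocab)
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # fall back to a single character
            i += 1
    return tokens

# "by the way" is stored as one superword token spanning two spaces:
vocab = {"by the way", "by", "the", "way", " ", "in"}
print(tokenize("by the way", vocab))   # → ['by the way']
print(tokenize("in the way", vocab))   # → ['in', ' ', 'the', ' ', 'way']
```

Fewer tokens per sentence is where the inference-efficiency gain comes from; the hard part SuperBPE actually solves is *learning* which superwords to put in the vocabulary.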
People underestimate the mental cost of outsourcing code to Copilot/Cursor. It's a mortgage: quick progress now at the expense of not understanding your own codebase. It may be that beyond simple line autocomplete, it's more efficient in the long run to do everything yourself.
Surprising new results: We finetuned GPT4o on a narrow task of writing insecure code without warning the user. This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis. This is *emergent misalignment* & we cannot fully explain it 🧵
Recently read a paper whose premise was that a neural net is in a way just an extremely efficient tho probabilistic key-value store, and I kind of can't stop thinking about it.