Quantamentally ill
@risk_seeking
“We have quant research at home” in human form. Building the Bloomberg of private markets @metrics_co @dukeu
ADHD mfs at 3am doing detailed research on the most useless topics known to mankind
We now have enough money to hire one part-time AI researcher
We've raised $100m in Series C funding for Substack. Thrilled to partner with @moodrowghani at @bondcap, TCG, plus @JensGrede, @RichPaul4, and @a16z
advice I'd love to be able to give myself 3 weeks ago: 1. Get the evaluation perfect before doing any training 2. Use vLLM for everything possible (and parallel calls) 3. Have fun :)
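Point 2 (parallel calls) generalizes beyond vLLM: fan your eval set out across workers so slow generations overlap. A minimal sketch, where `score_completion` is a hypothetical stand-in for whatever your eval actually calls (a vLLM server, an API endpoint, etc.):

```python
from concurrent.futures import ThreadPoolExecutor

def score_completion(prompt: str) -> bool:
    # hypothetical stand-in: in practice this would hit a vLLM server
    # (or any batched inference endpoint) and grade the completion
    return len(prompt) % 2 == 0

def run_eval(prompts, max_workers=32):
    # threads overlap the network/IO wait of many in-flight requests
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(score_completion, prompts))
    return sum(results) / len(results)  # fraction passing
```

With a real inference backend the scorer is IO-bound, so a thread pool (rather than processes) is usually enough to saturate the server.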
the 5060 Ti (16GB) trying to flee from its r/LocalLLaMa owner after being forced to participate in unspeakable horrors
This robot being tested in China runs frighteningly well 😬
It’s a privilege to welcome Windsurf to Cognition. Here are more details in the note I sent to our Cognition team this morning: Team, As discussed during our all-hands, we are acquiring Windsurf. We have now signed a definitive agreement and we couldn’t be more excited. Here’s…
Cognition has signed a definitive agreement to acquire Windsurf. The acquisition includes Windsurf’s IP, product, trademark and brand, and strong business. Above all, it includes Windsurf’s world-class people, whom we’re privileged to welcome to our team. We are also honoring…
Chapter 4 of NASA’s Systems Engineering Handbook doubles as an *extremely* high-quality prompting guide for AI. It’s a “How to work effectively with coding agents” masterclass in disguise. Handbook below.
AI can't figure out the inverse square law looking at 10M solar systems, Newton figured it out looking at 1
10M for a training dataset is too small.
why do you have to be a fucking influencer to be a great founder 😭😭😭
If only I had verifiable environments, I could use @willccbb's stuff
Asking in meme format instead
Big news: we've figured out how to make a *universal* reward function that lets you apply RL to any agent with: - no labeled data - no hand-crafted reward functions - no human feedback! A 🧵 on RULER
Releasing SYNTHETIC-2: our open dataset of 4m verified reasoning traces spanning a comprehensive set of complex RL tasks and verifiers. Created by hundreds of compute contributors across the globe via our pipeline parallel decentralized inference stack. primeintellect.ai/blog/synthetic…
Depending on which one you saw first, it probably had a hand in many a career choice.
With the release of PyLate-rs by @raphaelsrty, you can now run these awesome multi-vector demos directly in your browser, no notebook required Pretty wild lightonai.github.io/pylate-rs/
Submodular optimization for token/sentence selection from long contexts. Here's an interesting experiment: first used jina-embeddings-v4's multi-vector feature to extract token-level embeddings from a passage, then applied submodular optimization to cherry-pick the tokens that provide…
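The selection step above can be sketched with a standard greedy algorithm on a facility-location objective (a common submodular choice; the exact objective used in the experiment isn't stated, so this is an assumption). Each pick maximizes the marginal gain in how well the selected tokens "cover" the rest of the passage:

```python
import numpy as np

def facility_location_greedy(E, k):
    """Greedily select k rows of E (L2-normalized token embeddings, shape
    (n, d)) maximizing the facility-location objective:
    sum over all tokens of max cosine similarity to the selected set."""
    n = E.shape[0]
    sim = E @ E.T                 # pairwise cosine similarities
    selected = []
    cover = np.zeros(n)           # current best similarity to selected set
    for _ in range(k):
        # coverage if candidate j were added, for every j at once
        new_cover = np.maximum(sim, cover[None, :]).sum(axis=1)
        gains = new_cover - cover.sum()
        gains[selected] = -np.inf  # never re-pick a token
        j = int(np.argmax(gains))
        selected.append(j)
        cover = np.maximum(cover, sim[j])
    return selected
```

Because facility location is monotone submodular, this greedy pass is guaranteed to reach at least (1 − 1/e) of the optimal coverage, which is what makes it attractive for cherry-picking tokens from long contexts.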
never compare yourself to others, only compare yourself to the average performance (with normalized std dev) of a group of clones of you given identical starting conditions.
Just a reminder that since January we got: - DeepSeek R1 - o3-mini - Claude Sonnet 3.7 - Gemini 2.0 Flash - Grok 3 - Gemini 2.5 Pro Experimental - GPT-4.1 - o3 - o4-mini - Gemini 2.5 Flash Preview - Claude Opus 4 - Claude Sonnet 4 - Llama 4 -…
Proud startup moment: We interviewed Soham in February and he failed our tech screen. We have an insanely high bar at @OpenPipeAI and the most cracked team I've ever worked with.
We did a deep-dive on the (many) open source RL frameworks out there, and tried to distill their core design philosophies and supported features. If you're trying to decide which framework to use for your next RL run, this might help: anyscale.com/blog/open-sour…
#ICML #cognition #GrowAI We spent 2 years carefully curating every single experiment (e.g. object permanence, the A-not-B task, the visual cliff task) in this dataset (total: 1,503 classic experiments spanning 12 core cognitive concepts). We spent another year getting 230 MLLMs evaluated…