shyamal
@shyamalanadkat
applied AI x startups @openai. I work with the world's leading startups and developers to bring the benefits of safe AI to every human. views my own 🇮🇳 @dukeu
Age of the Agent Orchestrator linkedin.com/pulse/age-agen…
china is shipping hardware faster than silicon valley rn. let that sink in.
indeed
The world is moving towards agents.
Static benchmarks don't measure what agents do best (multi-turn reasoning).
Thus, interactive benchmarks:
* Terminal Bench (@alexgshaw, @Mike_A_Merrill)
* Text Arena (@LeonGuertler)
* BALROG (@PaglieriDavide, @_rockt)
* ARC-AGI-3 (@arcprize)
learn how to RL fine tune from the best - @promptshant and @tsautory
The fine-tuning dashboard makes it easy to visualize how model performance shifts across checkpoints. Watch @promptshant and @tsautory use it to dial in precision vs recall for a nuanced legal classification problem. Watch the full demo from Build Hours: webinar.openai.com/buildhours/rei….
"Home light music synchronization with GPT-5 in 5 minutes" is essentially what's coming very soon. And you were clowning him for not having coding taste? Imagine the demos.
yc_s25 x openai 🧡
limited edition drop: @OpenAI x @ycombinator s25
I will officially start at OpenAI as CEO of Applications on August 18. I am sharing this essay on why I believe AI can be the greatest source of empowerment for all. 🧵 openai.com/index/ai-as-th…
This guy gets it
i’m much more inclined to say that the RL *system* inside OpenAI is AGI rather than any fixed model checkpoint which comes out of it
Eric Schmidt asks what happens when AI creates millions of polymaths.
History's greatest discoveries came from experts applying ideas from one field to another, something AI can't do yet.
"the rules keep changing"
solving non-stationarity might bring breakthroughs, but it'll…
systems that create new knowledge must do a few hard things well
> they must generate lots of varied ideas, expose them to reality, discard most of them, and remember what worked
> they must be able to question themselves - not just their outputs, but their methods
> they must…
incredible!
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
spot on!
Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly…
getting started with evals doesn't require too much. the pattern that we've seen work for small teams looks a lot like test‑driven development applied to AI engineering:
1/ anchor evals in user stories, not in abstract benchmarks: sit down with your product/design counterpart…
💯 era of experience
Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly…
what would it take to create a "10x school or platform" that produces engineers and researchers en masse?