Abhijnan Nath
@AbhijnanN
Keen observer of l'affaire politics, culture and life. Loves Chopin, Pavarotti and benign sarcasm.
Excited to announce that our paper, "AxomiyaBERTa: A Phonologically-aware Transformer Model for Assamese" got accepted at the prestigious #ACL2023NLP Findings. Many thanks to @NikhilKrishnasw and Sheikh Mannan! So glad! #ArtificialIntelligence #Assam

Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an important lesson about how to live my own life. One of the big concepts in RL is that you always want to be “on-policy”: instead of mimicking other people’s…
More big news in ART (our RL trainer for agents)! You can now train multiple message histories as part of the same agent run. Unlocking patterns like: - Recursively calling sub-agents - Long-running agents that compact history - Proper handling of multi-turn <think> tokens
Big news: we've figured out how to make a *universal* reward function that lets you apply RL to any agent with: - no labeled data - no hand-crafted reward functions - no human feedback! A 🧵 on RULER
Why can AIs code for 1h but not 10h? A simple explanation: if there's a 10% chance of error per 10min step (say), the success rate is: 1h: 53% 4h: 8% 10h: 0.2% @tobyordoxford has tested this 'constant error rate' theory and shown it's a good fit for the data chance of…
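The arithmetic in the post above can be checked with a short sketch. It assumes, as the post does for illustration, independent 10-minute steps that each fail with 10% probability, so a task of n steps succeeds with probability 0.9^n. The function name `success_rate` and its parameters are hypothetical and not taken from the post or the cited analysis. Under these assumptions the 10h figure comes out near 0.2%.

```python
def success_rate(hours, step_minutes=10, error_per_step=0.10):
    """Probability of finishing a task with no error, assuming every
    step fails independently with the same probability (an assumption
    made for illustration, following the post's example numbers)."""
    steps = round(hours * 60 / step_minutes)
    return (1 - error_per_step) ** steps


if __name__ == "__main__":
    # Print the success rate for the three task lengths in the post.
    for hours in (1, 4, 10):
        print(f"{hours:>2}h: {success_rate(hours):.2%}")
```

The exponent grows linearly with task length, so the success rate decays exponentially: this is why a model that is usually fine over one hour almost never completes ten.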
Something about this crash doesn’t sit right. A Boeing 787 is one of the most advanced passenger aircraft in the world. It’s built to handle engine failures, weather, and even major system issues. Yet today, one crashed right after takeoff from Ahmedabad. What’s deeply unusual:…
YC on the key prompting techniques used by the best AI startups:
Claude 4 dropped 21 hours ago. Turns out, it threatened to expose an engineer’s affair to avoid being shut down🧵
Changing my model's tool calling interface from JSON to YAML had surprising side effects. Entropy collapse is one of the biggest issues with GRPO. I've learned that small changes to one's environment can have massive impacts on performance. Surprisingly, changing from JSON to…
Tool calling with JSON has been pretty brittle for me, with models catastrophically forgetting how to properly format a JSON tool call after hours of training. Is there a better way? YAML maybe?
Noticing myself adopting a certain rhythm in AI-assisted coding (i.e. code I actually and professionally care about, in contrast to vibe code). 1. Stuff everything relevant into context (this can take a while in big projects. If the project is small enough just stuff everything…
If you are wondering why India is not able to compete with China / Vietnam in Manufacturing, here is a gist. If you are a brave Indian who decides to start an MSME or a Manufacturing company or a factory 1. A Municipal clerk / officer can reject your land registration file…
I need an ai agent that monitors my twitter bookmarks and goes and does something productive with them.
Rich Sutton just published his most important essay on AI since The Bitter Lesson: "Welcome to the Era of Experience" Sutton and his advisee Silver argue that the “era of human data,” dominated by supervised pre‑training and RL‑from‑human‑feedback, has hit diminishing returns;…
// Advances and Challenges in Foundation Agents // 260+ pages on this one! It provides a comprehensive overview, framing intelligent agents within a modular, brain-inspired architecture that integrates principles from cognitive science, neuroscience, and computational research.
the chatgpt launch 26 months ago was one of the craziest viral moments i'd ever seen, and we added one million users in five days. we added one million users in the last hour.
13 Years to Learn, 13 Seconds to Change: Depression loves dark rooms, unmade beds, and endless scrolling. It thrives in silence, isolation, and the comfort of doing nothing. But the moment you step outside, it loses its grip. Sunlight hits your face... your mind resets. A…
Diffusion models leverage a variety of samplers. Deterministic methods like DDIM produce orderly paths. In contrast, stochastic samplers like DDPM produce chaotic trajectories. Despite their differences, both methods draw valid samples from the underlying distribution.
Indians had Algebra before the Muslim prophet & religion were even born. Here is the Bakhshali Manuscript, dating back to the 3rd century CE. It is an Algebraic treatise. The Bakhshali manuscript, which has been carbon-dated to the 3rd century CE, is an ancient Hindu treatise…
people underestimate the mental cost of outsourcing code to Copilot/Cursor. it's a mortgage: quick progress now at the expense of not understanding your own codebase. it may be that beyond simple line autocomplete, it's more efficient in the long run to do everything yourself
Some things that I think open source reasoning efforts are maybe getting wrong: 1. The challenge is not boosting accuracy with RL; the challenge is getting RL to scale inference compute. People are still acting as if RL suddenly started working, and sharing accuracy vs step no. But…