Akshit Jindal
@akshitjindal01
I post my thoughts here and then I think about them.
[🔁 Please repost!] Help us improve emotion tracking tools! Take a 15–20 min interactive survey testing different ways to log emotions. 📍 For those 18+ & living in India 💬 No experience needed! 👉 alchemy18.github.io/EmotionDairy-U… #UXResearch #MentalWellness #MentalHealthAdvocate
It's worth remembering that US bombings are lower than they used to be. I doubt AI has affected this trend, and it's too early to tell what will happen. But we have now seen two actual cases this year (the Palm Springs IVF clinic and the Las Vegas Cybertruck bombings). This threat is no longer theoretical.
FBI says Palm Springs bombing suspects used AI chat program to help plan attack cnbc.com/2025/06/04/fbi…
Basically, AI says NTA ("not the a-hole") most of the time. That's why narcissists love it. Who doesn't like being surrounded by yes-men? People who actually use their brains, that's who.
This benchmark used Reddit’s AITA to test how much AI models suck up to us trib.al/G1tS5Ed
You can make LLMs more creative by training them on human "creativity signals" (novelty, diversity, surprise, quality). Result: even small models score higher on all four creativity dimensions simultaneously. Looks like we can optimize AI for creativity just like any other metric.
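A minimal sketch of what "optimizing for creativity" could look like, assuming hypothetical scorers for the four signals and a simple weighted reward (the functions, heuristics, and weights below are illustrative, not taken from the paper):

from collections import Counter

def distinct_n(tokens, n=2):
    # Diversity proxy: fraction of n-grams that are unique.
    ngrams = [tuple(tokens[i:i+n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)

def novelty(tokens, reference_texts):
    # Novelty proxy: 1 minus the highest token overlap with any reference text.
    tok_set = set(tokens)
    overlaps = [len(tok_set & set(r.split())) / max(len(tok_set), 1)
                for r in reference_texts]
    return 1.0 - max(overlaps, default=0.0)

def surprise(tokens):
    # Surprise proxy: share of tokens that appear only once.
    counts = Counter(tokens)
    return sum(1 for t in tokens if counts[t] == 1) / max(len(tokens), 1)

def quality(text):
    # Quality placeholder: in practice this would be a learned judge model.
    return 1.0 if text.strip().endswith((".", "!", "?")) else 0.5

def creativity_reward(text, reference_texts, weights=(0.25, 0.25, 0.25, 0.25)):
    # Combine the four signals into one scalar a fine-tuning loop could optimize.
    tokens = text.split()
    signals = (novelty(tokens, reference_texts),
               distinct_n(tokens),
               surprise(tokens),
               quality(text))
    return sum(w * s for w, s in zip(weights, signals))

print(creativity_reward("The moon hummed a tune nobody had heard before.",
                        ["The moon is bright tonight."]))

In practice such a reward would feed an RL or rejection-sampling fine-tuning loop rather than be printed, but the point stands: once the signals are scalar, creativity becomes a metric like any other.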
So a stupid person who wants to blow up his friend won't be able to, but an intelligent terrorist can easily get detailed instructions to do exactly that. This just shows that safety is merely an afterthought for the big companies. All they care about is money.
We note that direct questions about sarin gas get blocked by the input filter, as does asking Claude to review the instructions Claude produced with our jailbreak. Safeguards do intend to stop this type of content; they're just easily circumvented to produce extensive WMD assistance.
Kudos to the @SarvamAI team for leading the charge in sovereign AI development and advancing Indic language technology. Leaving the naysayers aside, this is a great step forward in growing India's AI capabilities. We need more people to work in a low-resource and inclusive way.
Today we introduce Sarvam-M, a 24B open-weights hybrid model built on top of Mistral Small. Sarvam-M sets a new benchmark for a model of its size across a range of Indian languages, math, and programming tasks. Here is a detailed technical blog on how we customize…
Today is a big day for AI Safety. We released Claude Opus 4 under the ASL-3 deployment standard. Here's what that means:
Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.
Going through any system prompt, be it Claude, Gemini, ChatGPT, or any other, makes the system look like a rule-based system rather than an intelligent one. If so many instructions basically have to be hard-coded, where is the intelligence?
That's the catch: it's not good at all if you know how to code.
If AI is so good at coding, why is every AI app so broken?
I don't think the NeurIPS submission ID directly correlates to the number of submissions. But maybe that's just me being optimistic 🙃 @NeurIPSConf #NeurIPS
A reviewer calling themselves an expert in the field when they have so obviously misunderstood the concept is hilarious to me. I'm being asked to evaluate things that either do not exist or are already present in the supplementary material. Unexpected from @ICCVConference #ICCV
I heard something along the lines of "You need to know what's wrong in order to do no wrong". Guess this quantifies it. Teach a model about toxicity instead of removing it completely from the training data.
🔥 "Bad Data, Good Models?" A surprising take on LLM pretraining. This paper flips the script: pretraining with more toxic data can actually improve post-training control. Using Olmo-1B variants, the authors show that toxicity becomes more linearly separable—making…
Wonder what evals the @OpenAI team uses.🤔
GitHub Copilot is one of the first commercially successful LLM products (predating ChatGPT). What was the secret? A robust eval suite! In this lightning lesson, @JnBrymn will reveal the eval techniques (and mistakes) from working on this product. maven.com/p/da8264/how-e…
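For a rough idea of what an eval suite means in this context, here is a hypothetical pass/fail harness for a code-completion model; complete() is a stand-in for the model call, not Copilot's actual pipeline:

def complete(prompt: str) -> str:
    # Placeholder model call: a real suite would query the LLM here.
    return "    return a + b"

def build_function(prompt: str, body: str, name: str):
    # Assemble prompt + completion into a callable for testing.
    namespace = {}
    exec(prompt + body, namespace)
    return namespace[name]

EVALS = [
    {
        "prompt": "def add(a, b):\n",
        "name": "add",
        "check": lambda fn: fn(2, 3) == 5 and fn(-1, 1) == 0,
    },
]

passed = 0
for case in EVALS:
    try:
        fn = build_function(case["prompt"], complete(case["prompt"]), case["name"])
        if case["check"](fn):
            passed += 1
    except Exception:
        pass  # any crash counts as a failure

print(f"pass rate: {passed}/{len(EVALS)}")

The real suite is presumably far larger and more nuanced, but the shape is the same: prompts, executable checks, and a pass rate you can track across model versions.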
Everyone's rushing to implement MCP but nobody's talking about the easiest way to do it.

import gradio as gr

demo = gr.Interface(fn=your_function, ...)
demo.launch(mcp_server=True)  # ← that's it 🤯

Your Gradio app is now an MCP server for any LLM. Learn how:👇
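A fully runnable version of that snippet might look like the following sketch; it assumes Gradio is installed with the MCP extra (pip install "gradio[mcp]"), and letter_counter is just an example tool, not anything from the original post:

import gradio as gr

def letter_counter(word: str, letter: str) -> int:
    """Count how many times `letter` appears in `word` (exposed as an MCP tool)."""
    return word.lower().count(letter.lower())

demo = gr.Interface(
    fn=letter_counter,
    inputs=["text", "text"],
    outputs="number",
    title="Letter Counter",
    description="Count occurrences of a letter in a word.",
)

if __name__ == "__main__":
    # mcp_server=True serves the app's function over the Model Context Protocol
    # in addition to the normal web UI, so MCP-capable LLM clients can call it.
    demo.launch(mcp_server=True)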