Umar Jamil
@hkproj
AI @MistralAI - Join the best AI community on Discord: https://discord.gg/4Tg4TkJQzE - Opinions my own
In this video, I'll be deriving and coding Flash Attention from scratch. No prior knowledge of CUDA or Triton is required. Link to the video: youtu.be/zy8ChVd_oTM All the code will be written in Python with Triton, but no prior knowledge of Triton is required. I'll also…
The equivalent of an LLM solving PhD-level math problems but failing at booking a trip.
I'm observing a mini Moravec's paradox within robotics: gymnastics that are difficult for humans are much easier for robots than "unsexy" tasks like cooking, cleaning, and assembling. It leads to a cognitive dissonance for people outside the field, "so, robots can parkour &…
His next paper: "AlphaGo aha-moment is all you need"
AlphaGo Moment for Model Architecture Discovery Paper: arxiv.org/abs/2507.18074
I'm going to finetune Voxtral to send me name and location of the people pronouncing it that way 🤌🏽🤌🏽🤌🏽.
People must stop pronouncing pistacchio "pis-ta-tshio". This needs to end now. 🇮🇹🤌🌰 @hkproj
The only reason China is No. 2 in the AI race is that the USA is still more attractive career-wise for Chinese researchers. The moment they get pissed off and head back home en masse...
Mistral started it, DeepSeek scaled it, Kimi K2 confirmed it: it's always more convenient to train an MoE.

Is the Meta Superintelligence team going to be another Apple Car project?
How I read it: a healthy marriage built on love and mutual respect - rather than just physical attraction - is more likely to provide the emotional support and stability you need to focus on building wealth.
The richer a man is, the less likely he is to marry a significantly younger woman. Important to share, as this fact cuts against a lot of the inventive narratives about the mating market you will read on here.
Chat, you have 128x H100 SXM for 6 months - what do you train using only publicly available datasets? Go crazy
Dear fellow Italians, please stop using the EuroPass CV to build your tech resume. It takes half the page to display your picture and the EU flag, loses another half to abundant margins, forces you to have a very long resume, and overall gives boomer vibes. Thank you 🙏
Gave up on music while running about a year ago. Now it’s just me and my inner voice. We’re making plans to go places.
new day, new open SOTA model 🫡
Voxtral comprehensively outperforms Whisper large-v3, the current leading open-source Speech Transcription model, in speech transcription. It beats GPT-4o mini Transcribe and Gemini 2.5 Flash across all tasks, and achieves state-of-the-art results on English short-form and…
Introducing Devstral Small and Medium 2507! This latest update offers improved performance and cost efficiency, perfectly suited for coding agents and software engineering tasks.
I recently asked ChatGPT to generate a picture of me, and it added some private info about me to the picture that I had not requested, taken from memories (e.g. my street name). Sometimes these are small details, but you risk leaking private info.
Looking for top AI Scientists & Research Engineers (incl. interns!) for my team: 💻 PyTorch / JAX 🧠 LLM / robotics / multimodal / diffusion 📚 Strong research or project track record 🎓 PhD / MSc or exceptional background Apply now: jobs.lever.co/mistral/b7ae8f…
It’s sad to admit but Soham Parekh can make millions selling courses on interviewing skills. It’s a @cluely market: cheating pays. Apparently a lot.