Alexia Jolicoeur-Martineau
@jm_alexia
Senior AI Researcher at the Samsung SAIT AI Lab 🐱💻 I build generative AI for images, videos, text, tabular data, weights, molecules, and video games.
We introduce ByteCraft 🎮, the world's first generative model of video games and animations through bytes. Text prompt -> Executable file
Paper: github.com/SamsungSAILMon…
Inference Code: github.com/SamsungSAILMon…
7B Model: huggingface.co/SamsungSAILMon…
Blog: emygervais.github.io/2025/03/15/byt…
🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding…
Too many AI updates today, it's making me dizzy!
🚀Introducing Hierarchical Reasoning Model🧠🤖 Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, no pretraining or CoT! Unlock the next AI breakthrough with…
AI TL;DR:
1. OpenAI gets gold medal on IMO 2025
2. Google gets gold medal on IMO 2025
3. Kimi K2 is the best open source model ever
4. Qwen outperforms it
All within a single week.
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…
This is not small! The team spent a long time building Qwen3-Coder after Qwen2.5-Coder. It is much bigger, but MoE-based, and way stronger and smarter than before! Not sure we can say it's competitive with Claude Sonnet 4, but it should for sure be a really good coding agent.…
🧵 Everyone is chasing new diffusion models—but what about the representations they generate from? We introduce Discrete Latent Codes (DLCs): - Discrete representation for diffusion models - Uncond. gen. SOTA FID (1.59 on ImageNet) - Compositional generation - Integrates with LLM 🧱
Something about this 2D style creates a sense of depth and scale that I haven't seen any actual game accomplish. Not Zelda, not Elden Ring
How about creating game assets with Midjourney?
Alibaba Qwen has just released a non-thinking model even more powerful than Kimi K2... And even better than Claude Opus 4 🤯 → 100% open source → Only 22B active parameters → Available for free in Qwen Chat All the links below
How to train a State-of-the-art agent model. Let's talk about the Kimi K2 paper.
🚨 The era of infinite internet data is ending, So we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
Interesting to see the praise for this AI-invented visual style. Everybody wants it, and indie devs are rushing to prototype it & make it real.
omw
Vibe coding GasStationBench rn. Models run a virtual gas station, adjusting prices, managing inventory, and handling customer feedback. GPT-4.1 and GPT-4o behave so differently. When a competitor lowered prices on "dutch chocolate," 4o would match the price but 4.1 would always…
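The benchmark described above isn't public, but the mechanics it names (setting prices, depleting inventory, reacting to a competitor) could be sketched as a tiny simulation loop. Everything here — class name, item names, demand model, starting numbers — is a hypothetical illustration, not the actual GasStationBench code.

```python
# Hypothetical sketch of a GasStationBench-style environment.
# All names and numbers are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class GasStationSim:
    cash: float = 1000.0
    inventory: dict = field(default_factory=lambda: {
        "regular": 500, "dutch chocolate": 40})
    prices: dict = field(default_factory=lambda: {
        "regular": 3.50, "dutch chocolate": 2.00})
    competitor_prices: dict = field(default_factory=lambda: {
        "regular": 3.45, "dutch chocolate": 1.80})

    def set_price(self, item: str, price: float) -> None:
        """The agent's action: adjust the shelf price of one item
        (e.g. match a competitor's cut, as 4o did in the tweet)."""
        self.prices[item] = round(price, 2)

    def step(self) -> float:
        """Simulate one day: demand halves for any item priced
        above the competitor; sales are capped by inventory."""
        revenue = 0.0
        for item, stock in self.inventory.items():
            base_demand = 20
            demand = (base_demand
                      if self.prices[item] <= self.competitor_prices[item]
                      else base_demand // 2)
            sold = min(stock, demand)
            self.inventory[item] = stock - sold
            revenue += sold * self.prices[item]
        self.cash += revenue
        return revenue
```

A price-matching agent would call `set_price("dutch chocolate", 1.80)` before `step()`; a hold-the-line agent would not, selling fewer units at the higher price. The interesting part of a benchmark like this is exactly that divergence in policy, not the arithmetic.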
the world isn't ready for GasStationBench
The sad robot on matharena.ai/imo/ is Grok 4. This shows again how careful one has to be with overblown claims from closed releases saying the usual "it's so over". Test contamination that cannot be checked makes benchmarks look great, but on novel problems the crash comes.
Wait NVIDIA has just released new SOTA open source models?! Available in 4 sizes 1.5B, 7B, 14B and 32B that you can run 100% locally. - OpenReasoning-Nemotron - SOTA scores across many benchmarks - Tailored for math, science, code How to run it on your laptop and details below
Not Even Bronze: Evaluating LLMs on 2025 International Math Olympiad 🥉 matharena.ai/imo/ Nice blog post from the team behind MathArena: Evaluating LLMs on Uncontaminated Math Competitions (arxiv.org/abs/2505.23281) providing independent analysis of LLM performance on IMO.
Mystery of sleep solved A groundbreaking study from the University of Oxford has uncovered a biological trigger for sleep: stress inside mitochondria (the energy-producing structures within brain cells). Researchers found that when mitochondria in specialized sleep-regulating…