Yasir
@0xyaza
Proud to introduce Group Sequence Policy Optimization (GSPO), our stable, efficient, and performant RL algorithm that powers the large-scale RL training of the latest Qwen3 models (Instruct, Coder, Thinking) 🚀 📄 huggingface.co/papers/2507.18…
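For context, GSPO's key move (as I read the paper) is doing the importance weighting and clipping at the sequence level rather than per token. A minimal PyTorch sketch of the clipped objective; the function name and tensor layout are mine, not the paper's:

```python
import torch

def gspo_loss(logp_new, logp_old, advantages, seq_lens, eps=0.2):
    """Clipped surrogate with a sequence-level importance ratio.

    logp_new / logp_old: summed token log-probs of each sampled response, shape [G].
    advantages: group-normalized advantages, shape [G].
    seq_lens: response lengths |y_i| as a float tensor, for length normalization.
    """
    # s_i = (pi_theta(y_i | x) / pi_old(y_i | x)) ** (1 / |y_i|)
    ratio = torch.exp((logp_new - logp_old) / seq_lens)
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    # PPO-style pessimistic objective, averaged over the group of G responses
    return -torch.mean(torch.minimum(ratio * advantages, clipped * advantages))
```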
Spot on.
This week we got:
*Qwen3-235B-A22B-Instruct-2507
*Qwen3-235B-A22B-Thinking-2507
*Qwen3-Coder-480B-A35B-Instruct
All open weights and Apache 2.0 licensed. I feel this marks the point where we can now have hugely effective models running locally. Someone needs to make consumer…
I swear if @AMD just makes a desktop SoC with 256GB or 512GB of unified RAM, they'd own the personal computing market lol.
Qwen on a ROLL! Thinking model that beats Gemini 2.5 Pro, o4-mini AND DeepSeek R1 too 🔥
Testing Qwen3-coder with the new 480B-instruct model on @hyperbolic_labs and it's been 🤌🤌🤌 so far.
I love when Claude tells me something will take 6 weeks. No dude, we are doing it this afternoon.
dspy now works great in the browser. try the cool new dspy notebook with built-in, local, in-browser models. @DSPyOSS @lateinteraction you folks will like this.
Very straightforward use of @DSPyOSS. Very nice example @MaximeRivest
My tutorial on: How to build an Automatically Branching Chat with DSPy is out! With DSPy, I am able to easily rein in the chaos of LLM outputs and actually use LLM generation in my coding logic. No manual parsing or fiddling with strings required. If you follow along and run…
I think @kirodotdev likely has the late entrant advantage. They've identified all the issues we've experienced using agentic coding, and nailed the user flow. IMHO @TaskmasterAI is the OG that influenced controlling the erratic behavior of all coding tools, but I have to say…
the weather is nice and all, but have you configured your @zeddotdev ide to use @Kimi_Moonshot Kimi-K2 via @GroqInc yet?? The speed is ridiculous.
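For anyone asking how: this is roughly the shape of the settings.json, assuming Zed's OpenAI-compatible provider config pointed at Groq's endpoint (check Zed's docs for the exact schema; the model id is Groq's):

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.groq.com/openai/v1",
      "available_models": [
        {
          "name": "moonshotai/kimi-k2-instruct",
          "display_name": "Kimi K2 (Groq)",
          "max_tokens": 131072
        }
      ]
    }
  }
}
```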
If there is one thing I've been told over and over again, but did not understand until it happened many times over, it's this: any and all opportunities are time-bound. There is a point past which the ROI on taking that opportunity is no longer worth it. Courage and speed are the defining…
When @GroqInc's inference speed meets their deployment speed... this is awesome, and I'm already using it heavily!
the colossal giant is here. @kimi_moonshot's kimi k2 with 1t parameters is now on @groqinc for instant tool calling for your coding agents. full context available for all. full speed ahead. 🫡
anyone else want kimi on groq or
📢 If you’re at #SIGIR2025 this week, make sure to be at Luca Scheerer’s paper talk: “WARP: An Efficient Engine for Multi-Vector Retrieval” (Wednesday 11am)
WARP makes PLAID, the famous ludicrously fast ColBERT engine, another 3x faster on CPUs. With the usual ColBERT quality!
This prompt injection screenshot is circulating. From an abstraction standpoint, it's another argument for Signatures. Signatures separate the fixed task spec (instructions + I/O schema) from the variable input data, and assign a semantic role for each input. That's in…
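Concretely, a minimal DSPy sketch of the idea (the Summarize signature is a made-up example): the instructions live in the signature, and untrusted text only ever enters as a typed input field.

```python
import dspy

class Summarize(dspy.Signature):
    """Summarize the document in one sentence."""  # fixed task spec (trusted)

    document: str = dspy.InputField(desc="untrusted source text")  # variable data
    summary: str = dspy.OutputField()

summarize = dspy.Predict(Summarize)
# The page contents are passed as data with a declared semantic role,
# never concatenated into the instruction string itself.
result = summarize(document="...scraped page contents here...")
```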
The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystallizing:
- Natively multimodal…
I’m so excited to announce Gemma 3n is here! 🎉
🔊 Multimodal (text/audio/image/video) understanding
🤯 Runs with as little as 2GB of RAM
🏆 First model under 10B with @lmarena_ai score of 1300+
Available now on @huggingface, @kaggle, llama.cpp, ai.dev, and more
Bonus if you use @GroqInc with any of the optimization techniques (miprov2 etc) - the speedup gain is an easy 10x
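If you want to try it, wiring a Groq-served model into DSPy is one line (the model id here is an assumption; any Groq-hosted model routes the same way through LiteLLM):

```python
import dspy

# Optimizers like MIPROv2 make many LLM calls, so fast inference compounds.
dspy.configure(lm=dspy.LM("groq/moonshotai/kimi-k2-instruct"))
```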
Not enough people know this: Every DSPy optimizer has ALWAYS natively allowed you to tune any complex program, with as many LLM calls and whatever structure you want. Multi-turn agents? Multi-module compound AI systems? Just use MIPRO or GRPO for prompt opt or RL the whole system!
how do you optimize a multi-prompt pipeline (deep-research style) with DSPy?
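One way, sketched under assumptions (my_search, report_quality, and trainset are hypothetical stand-ins): write the pipeline as one dspy.Module with a sub-module per prompt, then compile the whole thing; the optimizer tunes every module's prompt against a single end-to-end metric.

```python
import dspy

class DeepResearch(dspy.Module):
    def __init__(self):
        super().__init__()
        # One sub-module per prompt in the pipeline
        self.plan = dspy.ChainOfThought("question -> search_queries: list[str]")
        self.synthesize = dspy.ChainOfThought("question, notes -> report")

    def forward(self, question):
        queries = self.plan(question=question).search_queries
        notes = "\n".join(my_search(q) for q in queries)  # hypothetical retriever
        return self.synthesize(question=question, notes=notes)

# MIPROv2 jointly proposes instructions/demos for BOTH prompts,
# scored by one end-to-end metric over the final report.
optimizer = dspy.MIPROv2(metric=report_quality, auto="light")
optimized = optimizer.compile(DeepResearch(), trainset=trainset)
```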