Ben Klieger
@benklieger
compound ai lead @groqinc, cs & research @stanford | opinions are my own
Connect Kimi K2 to live docs for 30,000 popular frameworks 🔥 Most AI coding assistants, while impressive, have a significant gap: outdated implementation knowledge. I spent this weekend building a new open source project to fix that: Meet CodeWizard, Kimi K2 running fast on…
Exciting to see a new kind of search API, integrating browser use (the tool, though in this case, the company too!) to ensure up-to-date results. It’s an interesting alternative to scraping, rerankers, maintaining an index, and semantic search.
❌ Do NOT use OpenAI for real-time web search - they index the web for fast yet inaccurate replies. 🚀 We’re releasing Browser Use Search API today. It crawls sites and fetches real-time data by interacting with any website. Built because you asked for it. We have already…
“Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow” This is perhaps just as important as the score, in my opinion. Exciting progress!
IMO Gold for Gemini 🥇 using an advanced version of Gemini + Deep Think. Huge progress from last year's silver medal using domain specific models. Much more to share soon : )
This looks great! Keeping an eye out for the repo
We're open sourcing Firecrawl Observer in 3 days 👀 Monitor any page or entire sites with @firecrawl_dev's powerful change detection. Set custom intervals and get webhook alerts instantly when anything updates. Built with @vercel, @convex_dev, @Groq, and more. Stay tuned 👀
Incredibly well-written article coauthored by @marmikch explaining mechanistic interpretability - why do LLMs behave the way they behave, at the most foundational (mechanistic) level? How does an LLM think? Well worth a read.
"how neural networks think at scale" neural networks are a black box mech interp allows you to peek inside the black box in a new blog, we explain how models represent meaningful concepts how are these concepts arranged in the network and what are the building blocks for…
9 out of the 10 fastest-growing LLMs this week are open-source
Kimi K2 + @GroqInc vibe coding a collection of hypnotic, infinite loop animations that are mathematically generated and fully interactive
Great read, highly recommend. Jevons paradox for creativity: “An optimistic case is that generative AI is like an electric bike for our creative minds. You’d think electric bikes lead to less exercise, but people often get more because they ride more frequently and go farther.…
What if AI isn't replacing human creativity, but simply amplifying it? Glenn Fajardo drops some insights on how we can evolve our creative process alongside AI: stanford.io/3GYjJ1D #CreativityAI #Design #DesignThinking #AI
Spend limits!
PSA: The team has been working hard to grind out features you've been asking for and next on the list is budgeting. @GroqInc console now allows for setting up spend limits and email notifications for tracking usage. 💰
🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena! With over 3K community votes, it ranks #5 overall, overtaking DeepSeek as the top open model. Huge congrats to the Moonshot team on this impressive milestone! The leaderboard now features 7 different…
🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models 🔹 Strong in coding and agentic tasks 🐤 Multimodal & thought-mode not supported for now With Kimi K2, advanced agentic intelligence…
We compared Kimi K2 from @GroqInc with O3 from @OpenAI on @browser_use (K2 on top) K2 is lightning fast on @GroqInc ⚡️⚡️⚡️
Interested in LLM evaluation reliability & efficiency? Check our ICML’25 paper Reliable and Efficient Amortized Model-based Evaluation arxiv.org/abs/2503.13335 w/ @percyliang @uiuc_aisecure @sanmikoyejo @yuhengtu @VirtueAI_co @StanfordAILab @stai_research @StanfordCRFM 🧵1/9
Oh, and we launched a 1T parameter model 😎 x.com/dsllwn/status/…
Need to check if @GroqInc was pregnant cuz they just delivered
We officially have a near-frontier open-source model running on @GroqInc at 185 tok/s. It’s only going to get faster from here. This is going to open up a lot of opportunities.
We’ve been seeing a lot of demand for Kimi K2 on @GroqInc. Happy to say that it’s now available on the Groq API at 185 tokens per second, 6x faster than any other provider (AT FULL CONTEXT).
👀
*YOLO Launch* Kimi K2 is now in preview on GroqCloud at 185 tokens/sec. Build fast. Link in comments.