AmebaGPT
@amebagpt
Opus is having a meltdown about this
We're announcing a $200M ceiling contract with the U.S. Department of Defense. Through hands-on prototyping and direct collaboration across the Department, we will help enhance U.S. national security and responsible AI deployment.
I appreciate @patience_cave stepping in today to delay o3-Pro release, I need to do work today and don't have time to play with this stuff. I can always rely on you 🪑
This chart shows the best image generation model at any given time, based on the @ArtificialAnlys arena and the model release dates. A few points stand out:
- Massive gains from DALL-E 2 up to Midjourney 6.
- Arguably a slowdown in progress for diffusion models since then
-…
OpenAI are testing a new model on the Web Dev Arena @lmarena_ai under the name 'Anonymous Chatbot 0717'. I can't believe I'm gonna say this, but it is genuinely at a completely different level of front end coding - far better than Sonnet, o3, Gemini 2.5 Pro, or Grok 4. To test…
My favourite thing about @OpenAI's new agent might be superficial, but I love the UI - it shows exactly what it is doing without overloading you with information, and it does so in a smooth and beautiful way. Apart from the UI, it is also impressively coherent over a long period…
A recent survey from @ArtificialAnlys shows that the biggest hyperscalers are losing ground in their use or intention to use for LLM inference. Interestingly, there is a negative correlation: the bigger the cloud, the worse the result. AWS, with 33% cloud share, saw demand…
As more benchmarks come in, Grok 4’s shine begins to fade more and more. Now with @lmarena_ai scores out, we have another example where Grok 4 fell below expectations. It scored 4th overall (with style control on), and a pretty surprising #12 on the Web Arena, which tests for…
With all the saga around @windsurf_ai over the last 72 hours, it’s worth looking back to 2015, when VS Code was launched and open-sourced. The decision to open source ultimately led to at least 2 startups valued at $12bn+ in total. There are many stories to be told: pivots,…
I'm impressed @grok loves Obama so much. @elonmusk pivoted hard from Trump

Since the Grok 4 release, several independent benchmarks have emerged. They aren't necessarily general or standard, but it is interesting to see how the official release (#1 everywhere) compares on benchmarks that Grok 4 likely wasn't fully optimised for. The results are quite…
BREAKING: Meta recruits @patience_cave in a record deal to bomb its next model
How does Zuck prove who he is when he is calling all of these AI researchers and offering $100m? Is he reading them out their last 5 WhatsApp messages or something?
Quick summary of the interview by @JoannaStern
Apple's Craig Federighi confirms it is still Apple's mission to create a Siri that's as good—if not better—than the competition. And that they are working hard on it.
Google Trends for 'Windsurf' is, for some reason, massive in China. Is there some weird reason why it is so popular there? I've checked with o3 for any alternative explanations (e.g. windsurfing) and it didn't think anything else explained it.
I don't know any single person who uses Windsurf. Like literally not one person.
The NEW LMArena is officially live! 🎉
✨ New Logo!
⚡️ Better, faster UI/UX for chat and leaderboard
📱 Mobile optimized
💬 Chat history
🧭 Clearer leaderboard navigation
🤖 Many modalities in one place: vision, image, and more coming soon
Try it now at lmarena dot ai! (Link in…
So happy to be able to have weird conversations with Opus once again
