Aryan Vichare

@aryanvichare10

member of technical staff @lmarena_ai | prev. @Berkeley_EECS @vercel @v0 @aisdk

San Francisco

Joined February 2016

641 Following

2K Followers

Pinned

Aryan Vichare@aryanvichare10 · Feb 26

Amazing that Claude 3.7 is gapping so hard. Great job, @AnthropicAI! WebDev Arena has been a high-signal eval in my experience. It's very easy to distinguish models in their ability to create great websites.

lmarena.ai@lmarena_ai · Feb 26

BREAKING: Claude 3.7 Sonnet claims the #1 spot in WebDev Arena with a +100 score jump 🚀 over Claude 3.5 Sonnet! 🔥 Huge congrats to @AnthropicAI on this incredible milestone! Have you tried Claude 3.7 Sonnet in the WebDev Arena yet? Test it now (link below)

1.0K

Aryan Vichare@aryanvichare10 · 12 h

It's genuinely mind-boggling how good models are getting at one-shotting complex visualizations from simple prompts.
Prompt: "two black holes colliding animation"
This model perfectly implemented:
– 2-body gravity simulation
– Dynamic particle accretion disks
– Collision +…

566

Aryan Vichare Retweeted

lmarena.ai@lmarena_ai · Jul 23

🚨 BIG NEWS 🚨 Search Arena is live with 7 top models with search capabilities ready for testing. Be sure to have the "Search" modality selected in the chat box, and get testing.
🌐 @xAi: Grok 4
@anthropic: Claude Opus 4
@perplexity: Sonar Pro High & Reasoning Pro High…

491

143

57.0K

Aryan Vichare Retweeted

lmarena.ai@lmarena_ai · Jul 16

We’re delivering a bundle of polish to the LMArena experience, most of them inspired directly by your feedback 💬 Here’s a look at what’s new👇

7.0K

Aryan Vichare@aryanvichare10 · Jul 15

Thoughts on Grok 4 results in LMArena:
Grok's API model is tied for #3 overall with style control (remember, style control is default now in LMArena). Without style control, it's #2 overall. In Math, its preliminary ranking is tied for #1, along with Minimax-M1, Gemini-2.5-pro, and…

lmarena.ai@lmarena_ai · Jul 15

🚨 Breaking News: Grok 4's result is now live! With 4k+ community votes, xAI’s Grok-4 tied for #3 overall in Text Arena — a huge leap from Grok-3. It scores Top-3 across all categories (#1 in Math, #2 in Coding, #3 in Hard Prompts). Detailed analysis in the thread 🧵

7.0K

Aryan Vichare Retweeted

Andrej Karpathy@karpathy · Jul 13

Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly…

412

849

8.0K

5.0K

1.0M

Aryan Vichare@aryanvichare10 · Jul 10

grok 4 is live on lmarena for everyone!

lmarena.ai@lmarena_ai · Jul 10

🚨 New contender enters the Arena: @xAI’s Grok-4 is live! Grok-4 debuts impressively at #1 across many hard benchmarks. Now it’s time to put it to the real-world test: challenge Grok-4 with your toughest prompts!

731

Aryan Vichare@aryanvichare10 · Jul 10

Can't wait to see Grok 4's performance in WebDev Arena: "Visualization of 2 Black Holes Colliding"

1.0K

Aryan Vichare@aryanvichare10 · Jun 12

Life update: excited to share I’ve joined @lmarena_ai as Member of Technical Staff! Excited to advance the future of AI progress through open, human-centered evaluation alongside such a talented team

116

8.0K

Aryan Vichare Retweeted

lmarena.ai@lmarena_ai · May 27

The NEW LMArena is officially live! 🎉
✨ New Logo!
⚡️ Better, faster UI/UX for chat and leaderboard
📱 Mobile optimized
💬 Chat history
🧭 Clearer leaderboard navigation
🤖 Many modalities in one place: vision, image, and more coming soon
Try it now at lmarena dot ai! (Link in…

599

116

122.0K

Aryan Vichare Retweeted

Dillion@dillionverma · Apr 5

✨ Introducing Refract - The Ad Engine. Create 100+ video ads for your business in seconds. Live demo below 👇

828

1.0K

89.0K

Aryan Vichare Retweeted

Sherjil Ozair@sherjilozair · Apr 2

Today I'm launching my new company @GeneralAgentsCo and our first product.
Introducing Ace: The First Realtime Computer Autopilot.
Ace is not a chatbot. Ace performs tasks for you. On your computer. Using your mouse and keyboard. At superhuman speeds!

344

326

3.0K

2.0K

619.0K

Aryan Vichare Retweeted

lmarena.ai@lmarena_ai · Mar 10

Introducing our latest blog on WebDev Arena: A Live LLM Leaderboard for Web App Development! How does WebDev Arena work? Submit a prompt → Two LLMs battle it out → You vote on the better web app. Since launching in Dec 2024, we've gathered 100,000+ community votes evaluating…

163

27.0K

Aryan Vichare@aryanvichare10 · Mar 9

🌁

2.0K