Vision Transformers
@vitransformer
Building in ML with blogs 👇 | agentic workflows @lossfunk
Open source is back baby!
🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding…
Omg finally!
New video on the details of diffusion models: youtu.be/iv-5mZ_9CPY Produced by @welchlabs, this is the first in a small series of 3b1b this summer. I enjoyed providing editorial feedback throughout the last several months, and couldn't be happier with the result.
The first AI-powered structure prediction editor, powered by Boltz-2 with bulk structure prediction, is now available on the LiteFold Platform. Links in comments. Exciting updates coming soon! 🚀
i just would have gotten the claude max plan
the level of incompetence in this post is so confusing, i mean why would you use a transformer here in the first place?
how to get a job at @xai
230k GPUs, including 30k GB200s, are operational for training Grok @xAI in a single supercluster called Colossus 1 (inference is done by our cloud providers). At Colossus 2, the first batch of 550k GB200s & GB300s, also for training, start going online in a few weeks. As Jensen…
ok but china has more!
India has around 5,000 H100 GPUs in total. Elon Musk alone has 150,000. Let that sink in. We're not just behind in the AI race; we're not even on the track. At this rate, only a miracle can pull us back into the game. And if we lose this race now, forget about catching up…
my reaction after seeing this
Agents aren’t reliable. They don’t learn from experience. At @composiohq, we provide skills that evolve with your agents. @lightspeedvp gave us $25M to make agents usable.
inference seems cracked with torch.compile
AI researchers when they discovered that torch.compile doesn't scale well to real multi-node production training workloads and is a giant footgun
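Not from the thread itself, but for context, a minimal sketch of what compiled inference looks like in PyTorch (toy module and default settings assumed; the multi-node training footguns mentioned above are a separate regime):

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical toy model standing in for whatever is actually being served.
model = nn.Sequential(
    nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)
).to(device).eval()

# torch.compile captures the graph and emits fused kernels; the first call
# pays the compilation cost, later calls reuse the cached artifact.
compiled = torch.compile(model, mode="reduce-overhead")

x = torch.randn(8, 1024, device=device)
with torch.no_grad():
    compiled(x)        # warm-up / compile
    out = compiled(x)  # subsequent calls hit the compiled fast path
```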
(man this could have helped me in JEE) ;-;
Excited to share Aryabhatta 1.0, our leading model that scores 90.2% on JEE Mains, outperforming frontier models like o4 mini and Gemini Flash 2.5. Trained by us at @AthenaAgentRL, in collaboration with @physics__wallah, using custom RLVR training on 130K+ curated JEE problems…
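The tweet doesn't spell out the reward, but RLVR generally means scoring each rollout against a verifiable answer key. A toy sketch of what such a check could look like (the \boxed{} convention and the matching rules here are my assumptions, not details from the Aryabhatta release):

```python
import re

def jee_reward(model_output: str, gold_answer: str) -> float:
    """Toy verifiable reward: 1.0 if the extracted final answer matches the key, else 0.0."""
    # Assumes the rollout ends with a \boxed{...} answer; hypothetical convention.
    match = re.search(r"\\boxed\{([^}]*)\}", model_output)
    if match is None:
        return 0.0
    predicted = match.group(1).strip()
    try:
        # Numeric answers: compare with a small tolerance.
        return float(abs(float(predicted) - float(gold_answer)) < 1e-6)
    except ValueError:
        # Otherwise fall back to exact string match (e.g. option "B").
        return float(predicted == gold_answer.strip())
```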
yep it is a bit finicky in ai studio
But i guess we all have a quant now
🚨 Olympiad math + AI: We ran Google’s Gemini 2.5 Pro on the fresh IMO 2025 problems. With careful prompting and pipeline design, it solved 5 out of 6 — remarkable for tasks demanding deep insight and creativity. The model could win gold! 🥇 #AI #Math #LLMs #IMO2025
seems like everyone gave in to the hype
This past week, Harmonic had the opportunity to represent our advanced mathematical reasoning model, Aristotle, at the International Mathematics Olympiad - the most prestigious mathematics competition in the world. To uphold the sanctity of the student competition, the IMO Board…
The future of AI will not be metered. It will be owned by you. Our work continues in Brooklyn.
seems like the new american dream
> you get into a good uni
> mess with them academically or ethically
> get kicked out
> start a company $$$
in the process you piss off the establishment and get SF VCs to notice you
rescinded is the new dropout!
got rescinded from columbia
seems like about 33% CUDA, the rest PyTorch
took a quick look at this paper (just the convolution section) and I have several concerns about the claims:
1) pytorch by default does not execute synchronously on the GPU (host vs. device) and anyone who has forgotten syncs when benchmarking can tell you so
2) TF32 is enabled…
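For context on points 1) and 2): a rough sketch of how a convolution micro-benchmark would avoid both pitfalls, with explicit syncs and TF32 pinned off (toy shapes, not the paper's setup):

```python
import time
import torch

# Point 2: TF32 is on by default for conv/matmul on Ampere+ GPUs, so pin it
# explicitly to keep the baseline and the "optimized" kernel at the same precision.
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cudnn.allow_tf32 = False

conv = torch.nn.Conv2d(256, 256, kernel_size=3, padding=1).cuda()
x = torch.randn(64, 256, 56, 56, device="cuda")

with torch.no_grad():
    # Warm-up so lazy init / cuDNN autotuning doesn't pollute the measurement.
    for _ in range(10):
        conv(x)
    torch.cuda.synchronize()

    # Point 1: kernel launches are asynchronous, so sync before reading the clock.
    start = time.perf_counter()
    for _ in range(100):
        conv(x)
    torch.cuda.synchronize()

print(f"{(time.perf_counter() - start) / 100 * 1e3:.3f} ms per forward")
```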
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
Trains a DeepSeek-v3-671B model to optimize CUDA kernels using only execution-time speedup as reward.
Pipeline:
- SFT: Finetuned on 2.1K correct, executable CUDA variants from 6 LLMs across 250…
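The paper's exact reward isn't quoted here, but "execution-time speedup as reward" is roughly this pattern: time a reference kernel against the generated one and pay out the speedup only when outputs match (the function names and correctness gate below are my guesses, not the paper's formulation):

```python
import time
import torch

def bench(fn, *args, iters=50):
    """Median wall-clock time of a CUDA callable, with explicit syncs."""
    for _ in range(5):                      # warm-up
        fn(*args)
    torch.cuda.synchronize()
    times = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn(*args)
        torch.cuda.synchronize()
        times.append(time.perf_counter() - t0)
    return sorted(times)[len(times) // 2]

def speedup_reward(reference_fn, candidate_fn, *args, atol=1e-3):
    """Reward = measured speedup over the reference, zero if the candidate is wrong."""
    try:
        if not torch.allclose(reference_fn(*args), candidate_fn(*args), atol=atol):
            return 0.0                      # wrong output -> no reward
    except RuntimeError:
        return 0.0                          # crash / failed launch -> no reward
    return bench(reference_fn, *args) / bench(candidate_fn, *args)
```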
Llama2 7b kinda changed my life trajectory
got an opportunity to do a pod with @Meta, @AIatMeta on open source LLMs
had a great discussion with @sunil_abraham and @chheplo (folks i look up to) !!!
In this episode of AI Talks by AIM, powered by @Meta, with @sunil_abraham, Public Policy Director - Data Economy and Emerging Tech at Meta, India, we dive into a powerful conversation about how open-source generative AI is enabling real-world impact, far beyond Silicon Valley…