Pramod Goyal

@goyal__pramod

Trying to change the world one line at a time AI Dev zetta global prev @joindimension founder @hacktogetherdev

Maryland, USA

Joined November 2023

235 Following

7K Followers

Pinned

Pramod Goyal@goyal__pramod · May 11

Most influential LLM papers and the ideas they introduced (post 2017) A long thread 🧵

347

2.0K

4.0K

265.0K

Pramod Goyal@goyal__pramod · 10 h

I learned a fascinating thing. I used to think that KV caching is the name of the method, and you cache everything except the newest token output. But that is not the case; I cannot wrap my head around the matrix multiplication. But after I do, I will write about it.

Pramod Goyal@goyal__pramod · 10 h

I was working on kv caching and found an interesting short HF write up on it. If you wanna grasp the concept quickly, I will recommend checking it out.

4.0K

Pramod Goyal@goyal__pramod · 11 h

If you have any recommendations, do share. I would love to check em out!!

Pramod Goyal@goyal__pramod · 11 h

Working on a small reading section where I will add my favourite niche blogs, repos, tutorials, and books that aren't very popular but are extremely helpful.

506

Pramod Goyal@goyal__pramod · 11 h

Working on a small reading section where I will add my favourite niche blogs, repos, tutorials, and books that aren't very popular but are extremely helpful.

2.0K

Pramod Goyal Retweeted

Owain Evans@OwainEvans_UK · 24 h

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵

208

818

6.0K

4.0K

1.0M

Pramod Goyal Retweeted

Paul Couvert@itsPaulAi · 18 h

Wait so Alibaba Qwen has just released ANOTHER model?? Qwen3-Coder is simply one of the best coding models we've ever seen. → Still 100% open source → Up to 1M context window 🔥 → 35B active parameters → Same performance as Sonnet 4 They're releasing a CLI tool as well ↓

288

3.0K

2.0K

292.0K

Pramod Goyal Retweeted

Guan Wang@makingAGI · Jul 21

🚀Introducing Hierarchical Reasoning Model🧠🤖 Inspired by brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, no pretraining or CoT! Unlock next AI breakthrough with…

164

490

4.0K

3.0K

990.0K

Pramod Goyal Retweeted

Tushar Bansal@tushar_bans · Jul 21

Generative video is incredible, but ask it to explain a simple idea, and it often fails! Today we’re excited to introduce Programmatic Storytelling – a whole new way to craft videos and tell stories, built on vectors and code, not just pixels. @genime_labs

135

107

93.0K

Pramod Goyal Retweeted

Kimi.ai@Kimi_Moonshot · Jul 22

Kimi K2 tech report just dropped! Quick hits: - MuonClip optimizer: stable + token-efficient pretraining at trillion-parameter scale - 20K+ tools, real & simulated: unlocking scalable agentic data - Joint RL with verifiable + self-critique rubric rewards: alignment that adapts -…

235

2.0K

412

76.0K

Pramod Goyal Retweeted

Sophia Yang, Ph.D.@sophiamyang · Jul 20

How to train a model that actually understands both audio and text like Voxtral from @MistralAI? Here is a quick video walkthrough of the paper.

135

1.0K

889

48.0K

Pramod Goyal Retweeted

Tanishq Abraham is at ICML@iScienceLuvr · Jul 21

Kimi K2 paper dropped! describes: - MuonClip optimizer - large-scale agentic data synthesis pipeline that systematically generates tool-use demonstrations via simulated and real-world environments - an RL framework that combines RLVR with a self-critique rubric reward mechanism…

170

968

596

56.0K

Pramod Goyal Retweeted

Google DeepMind@GoogleDeepMind · Jul 21

An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵

143

712

4.0K

677

973.0K

Pramod Goyal Retweeted

Aishwarya@aishwarya_2x21 · Jul 20

and it's done.. scratch implementations for > Tensor > Parameter and Module > Linear layer > Relu > Sequential > SGD optimizer

232

11.0K

Pramod Goyal@goyal__pramod · Jul 21

A visual prompt ablation, that's awesome!!!

Pramod Goyal@goyal__pramod · Jul 21

They have a short but equally amazing Stable Diffusion visualizer. My life would have been so much simpler if I had found this earlier.

1.0K

Pramod Goyal@goyal__pramod · Jul 21

They have a short but equally amazing Stable Diffusion visualizer. My life would have been so much simpler if I had found this earlier.

Pramod Goyal@goyal__pramod · Jul 21

The same guys have an insanely amazing GAN visualizer

3.0K