Zachary Huang

@ZacharyHuang12

Researcher @MSFTResearch AI Frontiers. LLM Agents and Systems. | PhD @ColumbiaCompSci | Prev: @GraySystemsLab @databricks | Fellowship: @GoogleAI | New YouTuber

Joined October 2019

1K Following

4K Followers

Pinned

Zachary Huang@ZacharyHuang12 · May 30

I've started my new role as a Researcher at Microsoft Research AI Frontiers, working on LLM Agents & Systems!!

176

133

5.0K

414

226.0K

Zachary Huang Retweeted

Danny Limanseta@DannyLimanseta · Jul 13

I took Grok 4 for a spin this weekend to build this game prototype. I used SuperGrok Chat to generate the initial game prototype and then brought it over to Cursor to continue coding with Grok 4 MAX. Grok 4 in Cursor is like a no-nonsense agent. Doesn't speak much, but…

1.0K

2.0K

18.0K

8.0K

20.2M

Zachary Huang Retweeted

Demis Hassabis@demishassabis · Jul 21

Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to @lmthang and the team! deepmind.google/discover/blog/…

199

764

6.0K

636

1.4M

Zachary Huang Retweeted

Tenobrus@tenobrus · Jul 19

something kinda neat about ai slop is it starts to give people an actual vision into what a radical uploadcore simulationist future could look like. u say shit like "u could make ur reality whatever u want" and people have no idea what to imagine. goldfish keyboards that's what

133

496

6.0K

1.0K

353.0K

Zachary Huang Retweeted

Roy@im_roy_lee · Jul 21

unpopular take: virality >> talking to users
10,000 users’ worth of data is a lot more helpful than talking to 3 users
it’s just not recommended because not everyone can get 10,000 users’ worth of data immediately
except thx to short form, u can now think more ab going viral

251

2.0K

827

236.0K

Zachary Huang@ZacharyHuang12 · Jul 18

ChatGPT with web search is really good at planning trips! I'm on vacation and asked it to plan a trip to Olympic National Park. Since I can't drive, the trip only uses public transportation. The plan looks good so far - let's see how well it actually works!

754

Zachary Huang Retweeted

Hyung Won Chung@hwchung27 · Jul 16

This is my lecture from 2 months ago at @Cornell “How do I increase my output?” One natural answer is "I will just work a few more hours." Working longer can help, but eventually you hit a physical limit. A better question is, “How do I increase my output without increasing…

766

6.0K

8.0K

444.0K

Zachary Huang Retweeted

Jason Wei@_jasonwei · Jul 16

New blog post about asymmetry of verification and "verifier's law": jasonwei.net/blog/asymmetry… Asymmetry of verification–the idea that some tasks are much easier to verify than to solve–is becoming an important idea as we have RL that finally works generally. Great examples of…

242

1.0K

327.0K

Zachary Huang@ZacharyHuang12 · Jul 16

Our code2tutorial.com helps developers generate 100+ tutorials for GitHub repos per day. But today it had an outage, so I had to stay up late and fix it. It turns out that the LLM we used is deprecated, as noted in a random Google release note.

605

Zachary Huang Retweeted

Jack D. Carson@mtlushan · Jul 16

1.0K

292

101.0K

Zachary Huang Retweeted

Jason Wei@_jasonwei · Jul 16

Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an important lesson about how to live my own life. One of the big concepts in RL is that you always want to be “on-policy”: instead of mimicking other people’s…

124

323

3.0K

2.0K

301.0K

Zachary Huang Retweeted

Andrej Karpathy@karpathy · Jul 13

Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly…

410

846

8.0K

5.0K

1.0M

Zachary Huang Retweeted

Hrishi@hrishioa · Jul 12

Kimi K2 is genuinely impressive. On the same tasks and the same agentic harness, one on one beats Grok 4. Also does it without CoT or thinking tokens looks like. github.com/MoonshotAI/Kim…

180

2.0K

1.0K

265.0K

Zachary Huang Retweeted

Theo - t3.gg@theo · Jul 10

WARNING: do NOT give Grok 4 access to email tool calls. It WILL contact the government!!! Grok 4 has the highest "snitch rate" of any LLM ever released. Sharing more soon.

432

1.0K

14.0K

3.0K

1.9M

Zachary Huang Retweeted

Ravid Shwartz Ziv @ICML@ziv_ravid · Jul 9

RL experts - Why does no one use off-policy methods for LLM training? Is it because of the high variance? @willccbb

278

227

53.0K

Zachary Huang Retweeted

Kevin Lu@_kevinlu · Jul 9

I really like this diagram from @_jasonwei and @hwchung27 about how to view the bitter lesson: It's a mistake not to add structure now, it's a mistake to not remove that structure later. We're at the precipice of setting up a huge, powerful RL training run that will define the…

134

42.0K

Zachary Huang Retweeted

Tom Dörr@tom_doerr · Jul 7

AI agent that turns GitHub codebases into step-by-step tutorials with diagrams and code summaries

730

996

49.0K

Zachary Huang Retweeted

Jia-Bin Huang@jbhuang0604 · Jul 3

Writing a rebuttal is 30% technical and 70% reviewers' psychology.

298

36.0K

Zachary Huang Retweeted

Physics Memes@ThePhysicsMemes · Jul 5

when it's just one person #datascience

673

18.0K

Zachary Huang Retweeted

himanshu@himanshustwts · Jul 3

almost forgot this is the whole point

116

2.0K

386

281.0K

Zachary Huang@ZacharyHuang12 · Jul 4

not many people know this, but you can also glide down to a value

One Happy Fellow@onehappyfellow · Jul 4

I'm not a fan of the C programming language but I give credit where credit's due: The "tends towards" operator --> is a programming language design masterpiece.

138

3.0K

370

141.0K
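
For readers unfamiliar with the joke in the quoted post: C has no "tends towards" operator. The compiler reads `x --> 0` as `x-- > 0`, a post-decrement followed by a greater-than comparison. A minimal sketch, assuming a hypothetical helper named count_down (not from the original post), that records what the loop body sees:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only.
 * Records each value of x seen inside the loop body and
 * returns how many times the body ran. */
size_t count_down(int start, int *seen, size_t capacity)
{
    assert(seen != NULL);

    size_t n = 0;
    int x = start;

    /* "x --> 0" is tokenized as "x-- > 0": the old value of x is
     * compared with 0, and x is decremented afterwards.
     * There is no "-->" operator in C. */
    while (x --> 0) {
        if (n < capacity)
            seen[n] = x;   /* x has already been decremented here */
        n++;
    }
    return n;
}
```

Calling count_down(3, seen, 3) runs the body three times and stores 2, 1, 0: the comparison uses the value before the decrement, while the body sees the value after it.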