Tamay Besiroglu
@tamaybes
Working to fully automate software engineering @MechanizeWork
“life, liberty & the preservation of happiness” → “life, liberty & the pursuit of happiness” I like this edit because it shifts emphasis from safeguarding ("preservation," cautious, European) to striving ("pursuit," distinctly American, growth-oriented).
Thomas Jefferson’s rough draft copy of the Declaration of Independence
If you give your AI model a French name, it is perhaps not surprising it will be offline 20% of the year.
Claude going down is the new normal now. @AnthropicAI #claude
Many software engineers want to move into AI but think they need to learn ML first. We're offering researcher-level pay with zero AI background required.
We're hiring software engineers. $500k base. x.com/i/jobs/1919892…
insane team. insane work. insane trajectory.
We're hiring software engineers. $500k base. x.com/i/jobs/1919892…
Reproducing the Chinchilla paper by parsing the svg files in the paper was a fun project. Too bad Google never issued an erratum or followed through on their promise to release the data.
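The idea of recovering a paper's data by parsing its figures can be sketched roughly like this. All specifics below are hypothetical (the real figure's marker shapes, file names, and axis calibration values are not given in the tweet): scatter markers in an SVG are often `<circle>` elements, and two known tick marks per axis suffice to map pixel coordinates back to data coordinates.

```python
# Hypothetical sketch of recovering plotted points from a figure's SVG,
# in the spirit of the Chinchilla replication. Axis calibration tuples are
# (pixel_min, pixel_max, data_min, data_max), read off two known tick marks.
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

def pixel_to_data(px, py, x_axis, y_axis):
    """Linearly map SVG pixel coordinates to data coordinates."""
    x0, x1, dx0, dx1 = x_axis
    y0, y1, dy0, dy1 = y_axis
    x = dx0 + (px - x0) / (x1 - x0) * (dx1 - dx0)
    y = dy0 + (py - y0) / (y1 - y0) * (dy1 - dy0)
    return x, y

def extract_points(svg_text, x_axis, y_axis):
    """Pull every <circle> marker out of the SVG and convert to data space."""
    root = ET.fromstring(svg_text)
    points = []
    for c in root.iter(f"{SVG_NS}circle"):
        px, py = float(c.get("cx")), float(c.get("cy"))
        points.append(pixel_to_data(px, py, x_axis, y_axis))
    return points
```

Note the y-axis calibration typically runs "backwards" (SVG pixel y grows downward while data y grows upward), which the linear map handles automatically if the tick values are entered in pixel order.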
The Chinchilla scaling paper by Hoffmann et al. has been highly influential in the language modeling community. We tried to replicate a key part of their work and discovered discrepancies. Here's what we found. (1/9)
i disagree. which people were actually harmed by "mecha-hitler"? what is the standard you're using here for "doing better"? imo xai is doing better because they released a nice product earlier
My guess is that big tech companies increasingly opting to poach key personnel without acquiring the whole startup is driven by antitrust concerns. If true, this means that antitrust regulation adds meaningful equity risk for startup employees, which is unfortunate.
Welcome Windsurf to this list of totally serious independent companies
Especially pertinent blog post now that Grok 4 has reportedly scaled RL compute to the level of pretraining compute without delivering any overwhelming increase in performance as a result.
Despite being trained on more compute than GPT-3, AlphaGo Zero could only play Go, while GPT-3 could write essays, code, translate languages, and assist with countless other tasks. That gap shows that what you train on matters. Rich RL environments are now the bottleneck.
I appreciate the 2027 team paying bounties for finding mistakes in their model. But what's far more important is catching these errors yourself before widely promoting your work on big platforms like the NYT, Dwarkesh, etc.
Great reaction to criticism from the AI 2027 team - not only a graceful response but also "a $500 bounty to represent our appreciation".
The Big Beautiful Bill lets U.S. hyperscalers and AI labs fully expense GPUs and training upfront, likely providing tens of billions in subsidies for compute through Trump’s term. Surprised this isn't getting more attention; journalists frame the bill as bad for AI somehow?
Under the big, beautiful bill, AI training compute expenses qualify as R&D, making them immediately deductible in full during the year they're incurred.
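The value of immediate deductibility can be sketched with back-of-the-envelope arithmetic. All figures below are hypothetical (the tweet gives no numbers): the comparison is a fully expensed training spend versus the same spend amortized straight-line over five years, at the 21% US corporate rate.

```python
# Hypothetical sketch: first-year tax shield from full expensing vs.
# straight-line amortization. All dollar amounts are made-up illustrations.
def first_year_deduction(spend, amortization_years=None):
    """Year-1 deduction: the full spend if expensed immediately,
    otherwise one straight-line amortization slice."""
    if amortization_years is None:       # immediate full expensing
        return spend
    return spend / amortization_years    # straight-line amortization

compute_spend = 10e9   # hypothetical $10B in training compute
tax_rate = 0.21        # US corporate rate

expensed = first_year_deduction(compute_spend)
amortized = first_year_deduction(compute_spend, amortization_years=5)

# Extra first-year tax shield from expensing rather than amortizing:
extra_shield = (expensed - amortized) * tax_rate
```

On these assumed numbers the first-year deduction grows from $2B to $10B, worth roughly $1.7B in deferred tax; the total deduction over the asset's life is the same either way, so the benefit is timing, not amount.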
If your interviewing and vetting process for early hires doesn’t catch frauds, then you’re probably doing something wrong. Get references, fly them out, reconstruct their past work on the Wayback Machine to see how productive they are, etc.
You know you've had an impact when leadership decides the only path forward is a clean-room team reboot.
how come Yann LeCun is not part of the superintelligence team?
Economists really don't like to entertain the possibility that AI might actually generate new knowledge, and instead insist on dubious claims like it can only "interpolate between known points of knowledge."

How much do we need to scale RL to enable its GPT-3 moment? We expect we'll soon need roughly 10,000 years of human-equivalent task time, comparable to the cumulative effort behind major projects like GTA V or Windows 2008.
Before GPT-3, achieving good performance required specialized fine-tuning for each task. Today's RL is similar: models need to be carefully trained to handle tasks like deep research, web search, or coding. But we think RL will soon have its GPT-3 moment.