Aurko Roy
@aurko79
ML research | @AIatMeta (2025-2025) | @GoogleDeepmind (2023-2025) | @GoogleAI (Brain) (2017-2023) | CS PhD @Georgiatech | CS @IITKanpur
Excited to share what I worked on during my time at Meta.
- We introduce a Triton-accelerated Transformer with *2-simplicial attention*—a tri-linear generalization of dot-product attention
- We show how to adapt RoPE to tri-linear forms
- We show 2-simplicial attention scales…
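For intuition, here is a minimal, unoptimized PyTorch sketch of what a tri-linear attention step can look like (my own naive O(n³) reading of the idea, not the paper's Triton kernel; the scaling, masking, and value-combination choices below are assumptions for illustration):

```python
import torch

def two_simplicial_attention(q, k1, k2, v1, v2, causal=True):
    """Naive O(n^3) sketch of 2-simplicial (tri-linear) attention.

    q, k1, k2, v1, v2: tensors of shape (batch, seq, dim).
    Logits are a trilinear product of one query with two keys, the softmax
    runs jointly over key *pairs* (j, k), and values are combined elementwise
    before being aggregated.
    """
    b, n, d = q.shape
    # logits[b, i, j, k] = sum_d q[b,i,d] * k1[b,j,d] * k2[b,k,d]
    logits = torch.einsum("bid,bjd,bkd->bijk", q, k1, k2) / (d ** 0.5)
    if causal:
        # token i may only attend to pairs (j, k) with j <= i and k <= i
        idx = torch.arange(n, device=q.device)
        mask = (idx[None, :, None] <= idx[:, None, None]) & \
               (idx[None, None, :] <= idx[:, None, None])
        logits = logits.masked_fill(~mask[None], float("-inf"))
    # joint softmax over all (j, k) pairs for each query position i
    attn = torch.softmax(logits.reshape(b, n, n * n), dim=-1).reshape(b, n, n, n)
    vv = torch.einsum("bjd,bkd->bjkd", v1, v2)  # elementwise value pairs
    return torch.einsum("bijk,bjkd->bid", attn, vv)

x = torch.randn(2, 16, 64)
out = two_simplicial_attention(x, x, x, x, x)  # (2, 16, 64)
```

The cubic cost in sequence length is the obvious catch, which is why a local window (like the 32 x 32 x 32 setting that shows up later in this feed) matters in practice.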

After 78 years, an exponential improvement for Ramsey numbers was found by Jie Ma, Wujie Shen, and Shengjie Xie. gilkalai.wordpress.com/2025/07/23/ama…
Really enjoyed reading this work! One way I tried to explain subliminal learning is by drawing a parallel to watermarking text, which generally works by biasing generation at each step toward one partition of the token vocabulary (the partitioning happens at every step using a private…
Paper authors: @cloud_kx @minhxle1 @jameschua_sg @BetleyJan @anna_sztyber @saprmarks & me. arXiv pdf: arxiv.org/abs/2507.14805 Blogpost: alignment.anthropic.com/2025/sublimina… Supported by the Anthropic Fellows program and Truthful AI.
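To make the watermarking analogy in the comment above concrete, here is a hypothetical sketch of one keyed-partition decoding step (function and parameter names are mine, purely illustrative, in the style of green-list watermarking):

```python
import hashlib
import numpy as np

def greenlist_bias(logits, prev_token, private_key, green_frac=0.5, delta=2.0):
    """Bias next-token logits toward a keyed 'green' partition of the vocabulary.

    logits: 1-D numpy array of next-token logits.
    The vocabulary is re-partitioned at every step, seeded by a private key and
    the previous token; a small bias delta is added to the green half, which
    skews sampling without fixing the output deterministically.
    """
    vocab_size = logits.shape[-1]
    seed = int.from_bytes(
        hashlib.sha256(f"{private_key}:{prev_token}".encode()).digest()[:8], "big"
    )
    rng = np.random.default_rng(seed)
    green = rng.permutation(vocab_size)[: int(green_frac * vocab_size)]
    biased = logits.copy()
    biased[green] += delta
    return biased
```

Detection would re-derive the same partitions with the key and count how often sampled tokens land in the green half.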
The Invisible Leash: Why RLVR May Not Escape Its Origin
"RLVR is constrained by the base model's support (unable to sample solutions with zero initial probability) and operates as a conservative reweighting mechanism that may restrict the discovery of entirely original solutions"…
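One way to see the support constraint being quoted here, phrased in the standard KL-regularized RL setup (my paraphrase, not the paper's own derivation):

```latex
% The usual KL-regularized objective behind RLVR/RLHF,
%   max_pi  E_{y ~ pi(.|x)}[ r(x,y) ] - beta * KL( pi || pi_0 ),
% has the closed-form optimum
\[
  \pi^{*}(y \mid x) \;=\; \frac{\pi_0(y \mid x)\,\exp\!\big(r(x,y)/\beta\big)}{Z(x)},
  \qquad\text{so}\qquad
  \pi_0(y \mid x) = 0 \;\Longrightarrow\; \pi^{*}(y \mid x) = 0 .
\]
% Reweighting the base policy cannot place mass on solutions outside its
% support, no matter how large the (finite) reward on them would be.
```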
🚨 Olympiad math + AI: We ran Google’s Gemini 2.5 Pro on the fresh IMO 2025 problems. With careful prompting and pipeline design, it solved 5 out of 6 — remarkable for tasks demanding deep insight and creativity. The model could win gold! 🥇 #AI #Math #LLMs #IMO2025
Fascinating! It seems the base model with an inference-time harness gets gold already, without Deep Think.
Feel like it kind of went under the radar that Rohan joined Anthropic
You know where for sure! Anthropic.
Got nerd sniped into checking out karpathy's nanoGPT github; I made the following changes to run 2-simplicial attention on my mac on Shakespeare:
- 6 layers, 6 heads, 384 dim
- reduced ctxt len to 32
- 2-simplicial attention with 32 x 32 x 32 window
- run for 5000 steps…
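For reference, those changes roughly correspond to a config file like the one below, in the spirit of nanoGPT's config/train_shakespeare_char.py. The two simplicial knobs are hypothetical names for the modification (they don't exist in upstream nanoGPT), and the values not mentioned above are the stock Shakespeare-char defaults:

```python
# config/train_shakespeare_2simplicial.py  (hypothetical)
out_dir = "out-shakespeare-2simplicial"
dataset = "shakespeare_char"

n_layer = 6
n_head = 6
n_embd = 384
block_size = 32                    # reduced context length

use_2simplicial = True             # hypothetical flag: swap in 2-simplicial attention
simplicial_window = (32, 32, 32)   # full 32 x 32 x 32 window at this context length

batch_size = 64
max_iters = 5000
learning_rate = 1e-3
dropout = 0.2
```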

Interesting post. However, it seems to be in conflict with the most central problem in theoretical computer science, P vs NP, which is exactly the question: is it fundamentally easier to verify a solution than to solve the problem? Most people believe that verification is…
New blog post about asymmetry of verification and "verifier's law": jasonwei.net/blog/asymmetry… Asymmetry of verification – the idea that some tasks are much easier to verify than to solve – is becoming an important idea now that we have RL that finally works generally. Great examples of…
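To make the verification asymmetry concrete with the textbook example behind the P vs NP framing above (my example, not from either post): checking a candidate SAT assignment is linear in the size of the formula, while the only generic way we know to find one is exhaustive search.

```python
from itertools import product

# A CNF formula as clauses of literals: +v means "variable v is true", -v means false.
formula = [[1, -2], [-1, 3], [2, 3], [-3, -2]]

def verify(assignment, formula):
    """Linear-time check: every clause must contain at least one satisfied literal."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in formula
    )

def solve(formula, n_vars):
    """Brute-force search: up to 2^n candidate assignments in the worst case."""
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if verify(assignment, formula):
            return assignment
    return None

print(solve(formula, 3))  # {1: False, 2: False, 3: True}
```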
You just know they wanted to call this "2 Fast 2-Simplicial" so bad.
If we are thinking in terms of finite data, infinite compute, this is a really interesting read. Great work by @Happylemon56775. arxiv.org/pdf/2507.02754
Last week at Meta - looking back on the last 3 months I spent there, feel lucky to have worked with some amazing folks: @vinaysrao, @saanarkethayan, @_t_chou, @__yjc_, @_arohan_, @agarwl_, @brandfonbrener, @afrozenator, @dvsaisurya, @manzilzaheer Excited for what's next!
woooow, that's so similar to our Edge Transformer!! happy to know that in the end smth like that can work if well-executed and thanks for citing us :)
I'm attention-pilled: more complex attention mechanisms are the way forward even if it hurts context length; humans can't do long context either
- Clear value proposition
- Simple intuitive idea
- Thorough execution
- Cool results
Bonus: from excellent researchers
If you are interested in this topic, you might like my related paper, "Simplicial Hopfield networks", with Tomoki Fukai, which also showed some theoretical results for why this type of network should be better, how it can be further expanded upon, and how it can be made more efficient.