Pete Shaw
@ptshaw2
Research Scientist @GoogleDeepmind
Excited to share a new paper: “ALTA: Compiler-Based Analysis of Transformers” (w/ @James_Cohan, @jacobeisenstein, @kentonctlee, @JonathanBerant, @toutanova) arxiv.org/abs/2410.18077
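For context, ALTA follows the tradition of RASP-style languages whose programs compile to Transformer weights. Below is a minimal NumPy sketch of the select/aggregate primitives at the heart of such languages; this is an illustration of the general idea, not ALTA's actual syntax or API.

```python
import numpy as np

def select(keys, queries, predicate):
    # Hard-attention pattern: A[q, k] = 1 if predicate(keys[k], queries[q]).
    # This mirrors the "select" primitive of RASP-style languages.
    return np.array([[1.0 if predicate(k, q) else 0.0 for k in keys]
                     for q in queries])

def aggregate(attn, values):
    # Uniform average of the selected values at each query position,
    # mirroring the "aggregate" primitive.
    weights = attn / np.maximum(attn.sum(axis=1, keepdims=True), 1e-9)
    return weights @ np.asarray(values, dtype=float)

# Example: at each position, the fraction of tokens so far equal to "a".
tokens = list("abaab")
positions = range(len(tokens))
causal = select(positions, positions, lambda k, q: k <= q)  # causal mask
is_a = [1.0 if t == "a" else 0.0 for t in tokens]
print(aggregate(causal, is_a))  # [1.0, 0.5, 0.667, 0.75, 0.6]
```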

AgentRewardBench will be presented at @COLM_conf 2025 in Montreal! See you soon and ping me if you want to meet up!
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories We are releasing the first benchmark to evaluate how well automatic evaluators, such as LLM judges, can evaluate web agent trajectories. We find that rule-based evals underreport success rates, and…
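To make "evaluating the evaluators" concrete, here is a small sketch of scoring a judge's verdicts against human annotations; the function and data are hypothetical, not the benchmark's actual API.

```python
# Hypothetical sketch: scoring an automatic evaluator (e.g., an LLM judge)
# against human success labels for web agent trajectories.

def score_judge(judge_verdicts, human_labels):
    """Precision/recall of the judge's 'success' verdicts vs. human labels."""
    tp = sum(j and h for j, h in zip(judge_verdicts, human_labels))
    fp = sum(j and not h for j, h in zip(judge_verdicts, human_labels))
    fn = sum(h and not j for j, h in zip(judge_verdicts, human_labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# A strict rule-based evaluator that misses valid alternative solutions
# shows low recall here, i.e., it underreports success.
human = [True, True, False, True, False]
rule_based = [True, False, False, False, False]  # misses two true successes
print(score_judge(rule_based, human))  # (1.0, 0.333...)
```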
Neural networks can express more than they learn, creating expressivity-trainability gaps. Our paper, “Mind The Gap,” shows that neural networks learn parallel algorithms best, and analyzes these gaps in terms of faithfulness and effectiveness. @rao2z
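One way to see the parallel-vs-sequential distinction: a prefix sum can be computed in N dependent steps or in O(log N) rounds of independent adds, and a fixed-depth architecture matches the latter. A minimal sketch of the contrast (my illustration, not the paper's code):

```python
import numpy as np

def sequential_prefix_sum(x):
    # N dependent steps: each output waits on the previous one.
    out, total = [], 0
    for v in x:
        total += v
        out.append(total)
    return np.array(out, dtype=float)

def parallel_prefix_sum(x):
    # Hillis-Steele scan: O(log N) rounds of independent, parallel adds,
    # the kind of shallow computation a fixed-depth network can mirror.
    x = np.array(x, dtype=float)
    shift = 1
    while shift < len(x):
        x[shift:] = x[shift:] + x[:-shift]
        shift *= 2
    return x

data = [3, 1, 4, 1, 5, 9, 2, 6]
assert np.allclose(sequential_prefix_sum(data), parallel_prefix_sum(data))
print(parallel_prefix_sum(data))  # [ 3.  4.  8.  9. 14. 23. 25. 31.]
```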
We're hiring a research scientist on the Foundational Research in Language team at GDM. The role is right here in sunny Seattle! job-boards.greenhouse.io/deepmind/jobs/…
This was my first time submitting to TMLR, and thanks to the reviewers and AE @murefil for making it a positive experience! TMLR seems to offer some nice pros vs. ICML/ICLR/NeurIPS, e.g.:
- Potentially lower-variance review process
- Not dependent on the conference calendar
ALTA: Compiler-Based Analysis of Transformers Peter Shaw, James Cohan, Jacob Eisenstein, Kenton Lee, Jonathan Berant, Kristina Toutanova. Action editor: Alessandro Sordoni. openreview.net/forum?id=h751w… #compiler #interpreter #programming
Hi ho! New work: arxiv.org/pdf/2503.14481 With amazing collabs @jacobeisenstein @jdjdhekchbdjd @adamjfisch @ddua17 @fantinehuot @mlapata @vicky_zayats Some things are easier to learn in a social setting. We show agents can learn to faithfully express their beliefs (along... 1/3
📣📣 My team at Google DeepMind is hiring a student researcher for summer/fall 2025 in Seattle! If you're a PhD student interested in getting deep RL to (finally) work reliably in interesting domains, apply at the link below and reach out to me via email so I know you applied👇
Agents like OpenAI Operator can solve complex computer tasks, but what happens when users direct them to cause harm, e.g. to automate hate speech and spread misinformation? To find out, we introduce SafeArena (safearena.github.io), a benchmark to assess the capabilities of web…
Submit your work to the first Agent + Language workshop at @aclmeeting! We have a list of awesome speakers, and you'll have the chance to meet other researchers working on agents!
Interested in learning more about LLM agents and in contributing to this topic?🚀 📢We're thrilled to announce REALM: The first Workshop for Research on Agent Language Models 🤖 #ACL2025NLP in Vienna 🎻 We have an exciting lineup of speakers 🗓️ Submit your work by *March 1st*
Excited to share our work on improving Gemini for learning!
In a recent technical report, LearnLM, our set of AI models and capabilities fine-tuned for learning, outperformed other leading AI models on the principles of learning science. Now it’s available to try out in AI Studio. Learn more ↓ goo.gle/4gmEdxp
I'll be at NeurIPS this week. Please reach out if you would like to chat!
I have multiple vacancies for PhD and Masters students at @Mila_Quebec @McGill_NLP in NLP/ML focusing on representation learning, reasoning, multimodality and alignment. Deadline for applications is Dec 1st. More details: mila.quebec/en/prospective…
The #AlphaFold 3 model code and weights are now available for academic use. We @GoogleDeepMind are excited to see how the research community continues to use AlphaFold to address open questions in biology and new lines of research. github.com/google-deepmin…
We love the excitement & results from the community on AlphaFold 3 and are doubling the AF Server daily job limit to 20. Happy to also share that we're working on releasing the AF3 model (incl weights) for academic use, which doesn’t depend on our research infra, within 6 months.
The RASP-L conjecture was one of the reasons I became interested in languages that compile to Transformers, leading to our recent work on ALTA. Looking forward to reading this one!
We finally formalize the RASP-L conjecture in this work!
- Theoretical guarantee on generalization for C-RASP tasks
- Validated on 8 algorithmic and 17 finite-state languages
- C-RASP is based on communication complexity; it can only transfer O(log N) bits between inputs
Thread for details👇
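A quick intuition for the O(log N) budget (my illustration, not from the paper): a running count over a length-N prefix fits in ⌈log2(N+1)⌉ bits, so count-based predicates such as prefix-majority stay within it, while copying an arbitrary N-token prefix across positions would not.

```python
import math

# Illustration (mine, not the paper's): a count over a length-n prefix
# fits in O(log n) bits, within the C-RASP communication budget.
def bits_for_count(n):
    return math.ceil(math.log2(n + 1))  # a count in [0, n] fits in this many bits

def majority_of_prefix(tokens, target="a"):
    # Count-based predicate: at each position, is `target` the majority so far?
    # Only the running count (O(log N) bits) must flow between positions.
    out, count = [], 0
    for i, t in enumerate(tokens, start=1):
        count += (t == target)
        out.append(count * 2 > i)
    return out

print(bits_for_count(1024))          # 11 bits for sequences up to length 1024
print(majority_of_prefix("abaab"))   # [True, False, True, True, True]
```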
I am hiring for a research engineering role in NYC, focused on Gemini post-training. If you are interested, please apply here. The deadline is just two weeks away. boards.greenhouse.io/deepmind/jobs/…