Seth Karten (@sethkarten)

Pinned

Seth Karten@sethkarten · 29 m

🚀 New preprint! 🤔 Can one agent “nudge” a synthetic civilization of Census‑grounded agents toward higher social welfare—all by optimizing utilities in‑context? Meet the LLM Economist ↓


Seth Karten Retweeted

Princeton Laboratory for Artificial Intelligence@PrincetonAInews · Jul 18

Shoutout to all the @Princeton researchers participating in @icmlconf #ICML2025! Browse through some of the cutting-edge research from AI Lab students, post-docs, and faculty being presented this year: pli.princeton.edu/blog/2025/prin…

Seth Karten@sethkarten · Jul 17

👾Are you interested in LLMs for two-player competitive games with partial information? Or perhaps just a Pokemon fan? Come check out our #ICML spotlight poster at 4:30PM in West Exhibition Hall B2-B3 #W-815

Seth Karten@sethkarten · Jul 11

🚀 6 days until my ICML spotlight poster! Key insights we’ll unpack: • Base LLM + test-time planning • Game-theoretic scaffolding • Context-engineered opponent prediction • Comparative LLM-as-judge (relative > absolute) Catch me Thu Jul 17, 4:30-7 PM PT👇

Seth Karten Retweeted

Princeton PLI@PrincetonPLI · Jul 16

We’re proud that PLI students, post-docs, and faculty will be featuring over 20 papers at the @icmlconf in Vancouver this week! From safer AI agents to long-context reasoning and RL, we’re excited to showcase the cutting edge research for you here: pli.princeton.edu/blog/2025/prin…

Seth Karten@sethkarten · Jul 15

🚀 Huge milestone from our Goedel-Prover team: we’ve just released a new state-of-the-art model (8B & 32B) for automated theorem proving—surpassing the previous best 671B DeepSeek model by a wide margin, all with academic compute!

Yong Lin@Yong18850571 · Jul 15

(1/4)🚨 Introducing Goedel-Prover V2 🚨 🔥🔥🔥 The strongest open-source theorem prover to date. 🥇 #1 on PutnamBench: Solves 64 problems—with far less compute. 🧠 New SOTA on MiniF2F: * 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B’s 82.4%. * 8B > 671B: Our 8B…

Seth Karten@sethkarten · Jul 15

Hello again, Vancouver

Seth Karten@sethkarten · Jul 14

Check out the PokeAgent Challenge for NeurIPS 2025 and consider participating!

Seth Karten@sethkarten · Jul 14

🚀 Launch day! The NeurIPS 2025 PokéAgent Challenge is live. Two tracks: ① Showdown Battling – imperfect-info, turn-based strategy ② Pokemon Emerald Speedrunning – long horizon RPG planning 5 M labeled replays • starter kit • baselines. Bring your LLM, RL, or hybrid…

Seth Karten@sethkarten · Jul 14

maybe I can finally make my quagsire-gastrodon mono-water dreams come true… really excited to see what approaches end up being successful!

Seth Karten@sethkarten · Jul 14

🚀 Launch day! The NeurIPS 2025 PokéAgent Challenge is live. Two tracks: ① Showdown Battling – imperfect-info, turn-based strategy ② Pokemon Emerald Speedrunning – long horizon RPG planning 5 M labeled replays • starter kit • baselines. Bring your LLM, RL, or hybrid…

Seth Karten@sethkarten · Jul 14

Excited to release our NeurIPS 2025 PokéAgent Challenge! Pokémon becomes a testbed for long-horizon learning and stochastic game theory. Curious to see which algorithms hold up under pressure.

Seth Karten@sethkarten · Jul 14

🚀 Launch day! The NeurIPS 2025 PokéAgent Challenge is live. Two tracks: ① Showdown Battling – imperfect-info, turn-based strategy ② Pokemon Emerald Speedrunning – long horizon RPG planning 5 M labeled replays • starter kit • baselines. Bring your LLM, RL, or hybrid…

Seth Karten@sethkarten · Jul 14

🚀 Super excited about this @NeurIPSConf challenge! 🚨 To help with training, we open-sourced 5M+ competitive Pokémon battles!!! Can't wait to see how people use the data

Seth Karten@sethkarten · Jul 14

🚀 Launch day! The NeurIPS 2025 PokéAgent Challenge is live. Two tracks: ① Showdown Battling – imperfect-info, turn-based strategy ② Pokemon Emerald Speedrunning – long horizon RPG planning 5 M labeled replays • starter kit • baselines. Bring your LLM, RL, or hybrid…
