Core Francisco Park
@corefpark
@Harvard. Working on: Multi-Agent Systems, AI AI Researcher
🚨 New Paper! A lot happens in the world every day—how can we update LLMs with belief-changing news? We introduce a new dataset "New News" and systematically study knowledge integration via System-2 Fine-Tuning (Sys2-FT). 1/n
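A minimal sketch of what the Sys2-FT recipe in the tweet suggests: have the model reason about the news in context (System 2) to generate self-QA pairs, then treat those pairs as ordinary supervised fine-tuning data. The prompt wording and model here are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch of the Sys2-FT idea: System-2 (in-context) reasoning generates
# training data; standard SFT then bakes it into the weights.
# Prompt format and model choice are placeholder assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in model

news = "A hypothetical headline: scientists confirm liquid water on Europa."

# System-2 step: ask the model, in context, to spell out what the news implies.
prompt = f"News: {news}\nWrite one question and answer this news implies.\nQ:"
full = generator(prompt, max_new_tokens=64, do_sample=True)[0]["generated_text"]
qa = "Q:" + full[len(prompt):]  # keep only the generated QA continuation

# The generated QA pair becomes one supervised fine-tuning example; repeating
# this over many news items yields the fine-tuning set.
example = {"prompt": f"News: {news}", "target": qa}
print(example)
```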

The real developer moat isn't coding anymore. LLMs can pump out functions faster than most of us can type. The moat is in the spaces between the code. It's knowing why your database is slow when the logs show nothing obvious. It's understanding that the "simple" feature request…
- 8000 USD / Mtok
- Input: 10 tok/s
- Output: 2 tok/s
- Latency: 10 mins ~ 2 weeks
- 12h downtime per day

Integrating this agent into a multi-agent system is challenging......
100%
I really like this diagram from @_jasonwei and @hwchung27 about how to view the bitter lesson: it's a mistake not to add structure now, and it's a mistake not to remove that structure later. We're at the precipice of setting up a huge, powerful RL training run that will define the…
New position paper! Machine Learning Conferences Should Establish a “Refutations and Critiques” Track Joint w/ @sanmikoyejo @JoshuaK92829 @yegordb @bremen79 @koustuvsinha @in4dmatics @JesseDodge @suchenzang @BrandoHablando @MGerstgrasser @is_h_a @ObbadElyas 1/6
Dear @NeurIPSConf -- it seems OpenReview is down entirely, and we cannot submit reviews for the upcoming review deadline tonight. Please share if you are having a similar issue. #neurips2025
🧵 What if two images have the same local parts but represent different global shapes purely through part arrangement? Humans can spot the difference instantly! The question is: can vision models do the same? 1/15
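For concreteness, here is one toy construction of such a pair (my own, not the thread's stimuli): the same four local parts arranged into two different global shapes.

```python
# Toy construction (mine, not the paper's): two images share the same local
# parts -- four identical black squares -- but arrange them into different
# global shapes (a diagonal line vs. a 2x2 cluster).
from PIL import Image, ImageDraw

def draw_parts(positions, size=96, part=16):
    img = Image.new("L", (size, size), 255)  # white grayscale canvas
    draw = ImageDraw.Draw(img)
    for x, y in positions:
        draw.rectangle([x, y, x + part, y + part], fill=0)
    return img

diagonal = draw_parts([(8, 8), (28, 28), (48, 48), (68, 68)])
cluster = draw_parts([(24, 24), (48, 24), (24, 48), (48, 48)])
diagonal.save("diagonal.png")
cluster.save("cluster.png")
# Same parts, different arrangement: trivial for humans; the thread asks
# whether vision models can tell them apart too.
```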
Very cool experiment
Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.
Amazing! I was wondering why there was no good curated dataset of humans playing the Game of 24. Here it is now :)
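For anyone unfamiliar, the Game of 24 asks you to combine four numbers with +, -, *, / (each used exactly once) to reach 24. A brute-force checker is only a few lines, which is part of what makes human play data on it so cleanly gradable; a sketch:

```python
# Brute-force Game of 24 checker: repeatedly combine any two numbers with
# any operation and recurse on the shrunken list, which covers every
# parenthesization. Context for the task only -- not from the dataset.
from itertools import combinations

def solve24(nums, target=24, eps=1e-6):
    if len(nums) == 1:
        return abs(nums[0] - target) < eps
    for i, j in combinations(range(len(nums)), 2):
        a, b = nums[i], nums[j]
        rest = [nums[k] for k in range(len(nums)) if k not in (i, j)]
        candidates = [a + b, a - b, b - a, a * b]
        if abs(b) > eps:
            candidates.append(a / b)
        if abs(a) > eps:
            candidates.append(b / a)
        if any(solve24(rest + [r], target, eps) for r in candidates):
            return True
    return False

print(solve24([4, 7, 8, 8]))  # True: (7 - 8/8) * 4 = 24
print(solve24([1, 1, 1, 1]))  # False: no solution exists
```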
Can we record and study human chains of thought? The think-aloud method, where participants voice their thoughts as they solve a task, offers a way! In our #CogSci2025 paper co-led with Ben Prystawski, we introduce a method to automate analysis of human reasoning traces! (1/8)🧵
🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient? Our work explains this & *predicts Transformer behavior throughout training* without access to its weights! 🧵 1/
Waiting for indexing of GitHub... :)
We indexed 100% of ArXiv for full-text multimodal retrieval so you don’t have to.
>search over actual content, not just abstracts
>natively multimodal, returning figures/equations/tables
>get academic metadata (citation strings, references, and more)
Happy building 🛠️
🚨 New paper drop! 🚨 🤔 When a transformer sees a sequence that could be explained by many rules, which rule does it pick? It chooses the simplest sufficient one! 🧵👇
This sounds like a great idea, maybe even allow a K-budget
Or fix K to something reasonable for each task (assuming each attempt takes time t). Pass@1024 is just ridiculous for competitive programming ig
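For reference, the pass@k being debated here is usually computed with the unbiased estimator from Chen et al. (2021): draw n samples per task, count the c correct ones, and estimate the chance that at least one of k draws passes.

```python
# The standard unbiased pass@k estimator (Chen et al., 2021): with n
# samples per task and c of them correct, estimate P(>=1 of k passes).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # fewer than k failures exist, so some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# Why huge K looks "ridiculous": even a rarely-correct model scores high.
print(round(pass_at_k(n=2000, c=3, k=1), 4))     # 0.0015
print(round(pass_at_k(n=2000, c=3, k=1024), 2))  # ~0.88
```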
🚨 ICML 2025 Paper! 🚨 Excited to announce "Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing." 🔗 arxiv.org/abs/2410.17194 We uncover a new phenomenon, Representation Shattering, to explain why KE edits negatively affect LLMs' reasoning. 🧵👇
When a system is complex, trying to understand it as a set of sparse causal relations is quite hard! Especially in deep learning, it's more like there is a regime where a claim applies, and it's more important to identify that regime!
What does it mean to say "method A works because of reason B"? In the sense of causal inference, if one can intervene on B, and it turns out that B causes A's success, it's safe to say "A works well because of B". However, in the analysis of ML methods it's nearly impossible to do this…
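In code, the clean version of "intervene on B" is just a paired ablation; everything below is a toy stand-in (train_and_eval is hypothetical) showing the shape of the experiment that's so hard to run for real ML systems.

```python
# Toy sketch of "intervening on B": hold everything else fixed, toggle
# component B across seeds, and compare outcomes. train_and_eval is a
# hypothetical stand-in -- for a real ML method the effect of B would
# also depend on the regime, which is the point of the tweets above.
import random
import statistics

def train_and_eval(use_B: bool, seed: int) -> float:
    rng = random.Random(seed)
    return 70.0 + (2.0 if use_B else 0.0) + rng.gauss(0.0, 1.0)  # toy score

with_B = [train_and_eval(True, s) for s in range(20)]
without_B = [train_and_eval(False, s) for s in range(20)]
print(statistics.mean(with_B) - statistics.mean(without_B))  # ~causal effect of B
```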
📢 New paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue: → LLMs are limited in creativity since they learn to predict the next token → creativity can be improved via multi-token learning & injecting noise ("seed-conditioning" 🌱) 1/ 🧵
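A hedged sketch of "seed-conditioning" as the tweet describes it: inject noise by prefixing each training example with a random seed string, so output diversity can come from the seed rather than from sampling. The exact format is my assumption, not the paper's spec.

```python
# Sketch of seed-conditioning as I read the tweet: prepend a random seed
# token to each training example so the model learns to condition its
# "creative choice" on the seed. Format details are assumptions.
import random

rng = random.Random(0)

def seed_condition(example: str) -> str:
    return f"<seed:{rng.randrange(65536)}> {example}"

# An open-ended task with many valid answers, seen with different seeds:
train_set = [
    "List three animals: cat, dog, fox",
    "List three animals: owl, bat, elk",
]
print([seed_condition(x) for x in train_set])
# At test time, draw a fresh seed prefix and decode (even greedily);
# different seeds should then yield different completions.
```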
Fluency in AI is different from fluency in other languages in this sense: only those who are fluent in AI understand what it means to be fluent in AI.
We’re setting a new standard at Zapier. 100% of new hires must be fluent in AI.