Haoyu Zhao
@thomaszhao1998
PhD student @Princeton, Research Intern @MSFTResearch. Recently interested in theorem proving.
Very proud to be a member of the Goedel team and contribute to our prover!
(1/4) 🚨 Introducing Goedel-Prover V2 🚨 🔥🔥🔥
The strongest open-source theorem prover to date.
🥇 #1 on PutnamBench: Solves 64 problems with far less compute.
🧠 New SOTA on MiniF2F:
* 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B’s 82.4%.
* 8B > 671B: Our 8B…
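For readers unfamiliar with the metric: Pass@32 credits the prover with a problem if any of 32 sampled proof attempts verifies. Below is a minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021) commonly used to report such numbers; the exact evaluation harness behind the Goedel-Prover results may differ, and the sample counts in the example are made up.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    attempts succeeds, given n total samples of which c are correct."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples, so a correct one is always drawn
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical counts: 10 verified proofs out of 64 samples for one problem.
print(round(pass_at_k(n=64, c=10, k=32), 4))  # ~0.9996
```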
Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize 100+ algorithms like gzip compression, AES encryption and PCA. Frontier models struggle, finding only surface-level wins. Lots of headroom here!🧵⬇️
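Roughly, each task in a benchmark like this pairs a reference implementation with a correctness check, and a solution is scored by its speedup over that baseline. A minimal sketch of that scoring idea for a gzip-style compression task; the harness and function names here are illustrative, not AlgoTune's actual code.

```python
import time
import zlib

def reference_solve(data: bytes) -> bytes:
    # Baseline: standard-library (zlib) compression at the default level.
    return zlib.compress(data)

def candidate_solve(data: bytes) -> bytes:
    # An agent's attempt: trade compression ratio for speed.
    return zlib.compress(data, level=1)

def best_time(fn, data: bytes, n_runs: int = 5) -> float:
    """Best-of-n wall-clock time, to reduce measurement noise."""
    best = float("inf")
    for _ in range(n_runs):
        t0 = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - t0)
    return best

data = bytes(range(256)) * 4096  # toy input

# Correctness check: the candidate's output must still decompress to the input.
assert zlib.decompress(candidate_solve(data)) == data

speedup = best_time(reference_solve, data) / best_time(candidate_solve, data)
print(f"speedup over baseline: {speedup:.2f}x")
```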
@QuantaMagazine featured our work on the emergence of skill compositionality (and its limitations) in LLMs among the CS breakthroughs of the year. tinyurl.com/5f5jvzy5. The work was done over 2023 at @GoogleDeepMind and @PrincetonPLI. Key pieces: (i) mathematical framework for…
Fine-tuning can improve chatbots (e.g., Llama 2-Chat, GPT-3.5) on downstream tasks — but may unintentionally break their safety alignment. Our new paper: Adding a safety prompt is enough to largely mitigate the issue, but be cautious about when to add it! arxiv.org/abs/2402.18540
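A minimal sketch of the inference-side version of that mitigation, assuming a Hugging Face chat-template API; the safety prompt wording below is illustrative, and the paper's point about "when to add it" concerns whether the prompt is also present during fine-tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative safety system prompt; the paper's exact wording may differ.
SAFETY_PROMPT = (
    "You are a helpful, respectful and honest assistant. "
    "Always refuse requests that are harmful, unethical, or illegal."
)

def chat_with_safety_prompt(model_name: str, user_message: str) -> str:
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    messages = [
        {"role": "system", "content": SAFETY_PROMPT},  # prepended at inference time
        {"role": "user", "content": user_message},
    ]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    out = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated continuation.
    return tok.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True)
```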
@icmlconf **paper alert** Fine-tuning an LLM on a task gives it a new skill. Our “Skill localization” paper shows this skill lives in < 0.01% of parameters; the rest can be reverted to pre-trained values. 1/6 With @NSaunshi, @thomaszhao1998, @prfsanjeevarora. Link: arxiv.org/abs/2302.06600
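A minimal sketch of that reverting step: keep the fine-tuned values only on a tiny mask of parameters and restore the pre-trained values everywhere else. Selecting the mask by largest weight change is just a simple illustrative heuristic here, not the paper's exact procedure, and the sketch assumes all state-dict entries are floating-point parameter tensors.

```python
import torch

def graft(pretrained_sd: dict, finetuned_sd: dict, keep_frac: float = 1e-4) -> dict:
    """Keep fine-tuned values on the top `keep_frac` fraction of parameters
    (ranked by magnitude of change) and revert the rest to pre-trained values."""
    # Rank all parameters by how much fine-tuning moved them (illustrative heuristic).
    deltas = torch.cat([
        (finetuned_sd[k] - pretrained_sd[k]).abs().flatten() for k in pretrained_sd
    ])
    k_keep = max(1, int(keep_frac * deltas.numel()))
    threshold = torch.topk(deltas, k_keep).values.min()

    grafted = {}
    for name, pre in pretrained_sd.items():
        fine = finetuned_sd[name]
        mask = (fine - pre).abs() >= threshold  # the localized "skill" parameters
        grafted[name] = torch.where(mask, fine, pre)
    return grafted
```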