Si-Qi LIU 刘思奇
@liusiqi42
Staff Research Engineer @DeepMind. AI/ML PhD candidate @UCL. Multiagent RL, Game Theory, Evaluation.
Yet scale it we must!
AI PROMPTING → AI VERIFYING AI prompting scales, because prompting is just typing. But AI verifying doesn’t scale, because verifying AI output involves much more than just typing. Sometimes you can verify by eye, which is why AI is great for frontend, images, and video. But…
Agreed. Unless you have a horse in the race, "where should AGI be built" is just irrelevant. The real question is how we can (continue to) incentivise open models at frontier performance - gatekeeping takes away competition and makes open models harder to justify for incumbents.
Finally took the time to go over Dario's essay on DeepSeek and export controls, and to be honest it was quite painful to read. And I say this as a great admirer of Anthropic and a big user of Claude* The first half of the essay reads like a lengthy attempt to justify that closed-source…
Sweeping generalisation is always the easy way out. It takes courage and independence to resist. Parroting what's been reported about one bad apple and generalising it to all Chinese people on stage at @NeurIPSConf is extremely disappointing. I hope this is an exception.
Mitigating racial bias in LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf. We have ethical reviews for authors, but not for invited speakers? 😡
Very true! When I was an AdWords intern in 2015, I cold e-mailed @sirbayes hoping to work on im2calories just because it was such a cool idea! That specific project didn't continue, but I was still given a chance to learn how to do research on his team. What a ride it has been!
Cold emails are hard and good ones can change a life. Here is my email to @NandoDF that started my career in ML (at the time I was a PM at Google) docs.google.com/document/d/1_u… Real effort (incl feedback) went into drafting it. Thanks to @EugeneVinitsky for nudging me to put it online
Haha, many late nights launching those policy gradient runs only to wake up to those infamous wobbly curves! Little did I know the recipe would go on to scale so well!
My colleague and former intern @liusiqi42 reminded me that we did RLFT for LMs almost 10 years ago - back then it was for an img2text model based on CNNs and RNNs. But same basic recipe - pre-train with MLE, then fine-tune with PG. arxiv.org/abs/1612.00370
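The recipe in the quoted tweet (maximum-likelihood pre-training followed by policy-gradient fine-tuning) can be sketched in a few lines. This is a minimal illustration, not the code from the linked paper: `model`, `reward_fn`, and the REINFORCE-style update below are placeholders/assumptions.

```python
import torch
import torch.nn.functional as F

def mle_step(model, tokens, optimizer):
    # Pre-training step: teacher-forced maximum likelihood, i.e. predict
    # token t+1 from tokens up to t with a cross-entropy loss.
    logits = model(tokens[:, :-1])                      # [batch, len-1, vocab]
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def pg_step(model, prompt, reward_fn, optimizer, max_len=32):
    # Fine-tuning step: sample a continuation from the model, score it with a
    # scalar reward (e.g. a caption metric for img2text), and apply a
    # REINFORCE-style update that reweights the sampled log-probabilities.
    tokens, log_probs = prompt, []
    for _ in range(max_len):
        logits = model(tokens)[:, -1]                   # next-token logits
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        tokens = torch.cat([tokens, action.unsqueeze(-1)], dim=-1)
    reward = reward_fn(tokens)                          # reward per sampled sequence
    loss = -(reward * torch.stack(log_probs).sum(dim=0)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```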
Majestic!
🚨Breaking: New Gemini-2.5-Pro (06-05) takes the #1 spot across all Arenas again! 🥇 #1 in Text, Vision, WebDev 🥇 #1 in Hard, Coding, Math, Creative, Multi-turn, Instruction Following, and Long Queries categories Huge congrats @GoogleDeepMind!
Think you know Gemini? 🤔 Think again. Meet Gemini 2.5: our most intelligent model 💡 The first release is Pro Experimental, which is state-of-the-art across many benchmarks - meaning it can handle complex problems and give more accurate responses. Try it now →…
👀
Breaking News from Chatbot Arena⚡ @GoogleDeepMind Gemini-2.0-Flash debuts at #3 Overall - a massive leap from Flash-002! Highlights (improvement from Flash-002): - Overall: #11 → #3 - Hard Prompts: #15 → #2 - Coding: #22 → #3 - Longer query: #8 → #1 - Overall…
We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level.🥈 It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵 dpmd.ai/imo-silver
Oh boy we need that colab notebook real bad!
Why has productivity (GDP per hour worked) grown faster in the US than in Europe over the last 15 years? [note: if you think this is because Americans work more than Europeans, you are wrong. Productivity measures output *per hour worked*]
Gemini and I also got a chance to watch the @OpenAI live announcement of gpt4o, using Project Astra! Congrats to the OpenAI team, super impressive work!
It's such an honor to work on Project Astra with such an amazing team from across Gemini and Google DeepMind! While the #GoogleIO keynote was happening we had a last-minute idea of watching the keynote with Project Astra. Check it out!
This Friday, we will have the honour of hosting @liusiqi42 from @GoogleDeepMind and @ucl presenting “NfgTransformer: Equivariant Representation Learning of Normal-form Games”. See y’all there!
👀
Crazy results🔥 On the other hand, all the mocap cameras and markers keep reminding me that it’s not yet possible to get this to work onboard with egocentric vision & sensors 😢 Long way to go💪
Have a view, make it known, and stand by it in public. Kudos to the team for setting an example for the industry!
Here is Claude 3's system prompt! Let me break it down 🧵