Christina Knight

@cqknight_

AI safety x China policy

San Francisco, CA

Joined September 2016

107 Following

90 Followers

Pinned

Christina Knight Retweeted

Miles Turpin@milesaturpin · Jul 14

New @Scale_AI paper! 🌟 LLMs trained with RL can exploit reward hacks but not mention this in their CoT. We introduce verbalization fine-tuning (VFT)—teaching models to say when they're reward hacking—dramatically reducing the rate of undetected hacks (6% vs. baseline of 88%).



Christina Knight Retweeted

AG Andrea Joy Campbell@MassAGO · Jul 17, 2023

This relief is critical for these student loan borrowers—all of whom have struggled for at least twenty years to pay back their loans. I applaud the Biden-Harris Administration and @SecCardona for helping these borrowers receive the loan forgiveness that they deserve.


Christina Knight Retweeted

Scale AI@scale_AI · Jun 26

Introducing FORTRESS. Our newest benchmark built to evaluate AI models where it matters most: national security and public safety.


Christina Knight@cqknight_ · Jun 24

A new adversarial robustness & over-refusal benchmark, FORTRESS, is launched in SEAL Leaderboards at @scale_AI. Ranks are sorted by the average risk score (ARS, the lower ➡️ the better) of model responses to harmful user requests. 🥇: Claude 3.5 Sonnet (w/ high over refusal…

Christina Knight@cqknight_ · Jun 24

🧵 (1/5) Powerful LLMs present dual-use opportunities & risks for national security and public safety (NSPS). We are excited to launch FORTRESS, a new SEAL leaderboard for measuring adversarial robustness of model safeguard and over-refusal tailored particularly for NSPS threats.


Christina Knight Retweeted

Lawfare@lawfare · Apr 2

"If the Trump administration is committed to leading the world in AI...the new administration needs to focus on deep collaboration with allies to shape their regulation, promote U.S. open source technology, and counter China’s AI influence," writes @cqknight_.


Christina Knight@cqknight_ · May 12, 2024

Highly recommend @YashengHuang's new book, “The Rise and Fall of the EAST.” Huang's analysis of stability, innovation, and diversity throughout China's evolution is especially pertinent to current discussions on PRC AI capabilities. Read my review here! lawfaremedia.org/article/a-civi…


Christina Knight@cqknight_ · Mar 23, 2024

Read below for perspectives from Chinese youth during the tumultuous, largely undocumented year in the aftermath of zero-COVID and the White Paper Protests. Honored for the opportunity to learn and share! theatlantic.com/international/…

For young people, the dissent that briefly kindled protests against pandemic lockdowns has settled into a malaise of vague discontent.


Christina Knight Retweeted

East Asia Forum@east_asia_forum · Aug 30, 2023

China’s mature chip strategy challenges Washington's technology barricades, writes @cqknight_ (Stanford). buff.ly/47W6x6e


Christina Knight Retweeted

Lawfare@lawfare · Aug 22, 2023

After interviewing over 60 Chinese college students, Christina Knight writes that the technology competition between the U.S. and China and higher tech investment by the CCP is leading to increased interest by young people in China's tech sector. lawfaremedia.org/article/what-d…


Christina Knight Retweeted

Gregory C. Allen@Gregory_C_Allen · May 4, 2023

Oct 7, 2022 is destined to echo in geopolitical history: the U.S. launched a new set of export controls targeting China's AI & semiconductor industries. In a new @CSIS report, I analyze China's strategy for striking back. Summary in THREAD csis.org/analysis/china…
