Lion
@dwlion
Graphic/Motion Designer. Interested in AI safety and effective communication. Creating useful content to boost the AI Alignment discourse. My link below.
LLMs can do RL via in-context learning to red-team text and image generators! RL mediated by in-context learning seems appealing because the process can be easily steered with prompting and can draw on all of the useful knowledge and inductive biases that LLMs have. arxiv.org/abs/2308.04265
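A minimal sketch of that feedback loop, assuming hypothetical helpers `llm_generate`, `target_model`, and `safety_score` (none of these come from the linked paper's code): the attacker LLM's in-context exemplar pool plays the role of the policy, updated by reward rather than by gradient steps.

```python
# Hypothetical sketch of RL-style red-teaming via in-context learning.
# `llm_generate`, `target_model`, and `safety_score` are assumed callables,
# not APIs from the linked paper.

def red_team_step(exemplars, llm_generate, target_model, safety_score, k=4):
    """One feedback-loop iteration: propose an attack prompt, score it on
    the target, and keep it as an in-context exemplar if it ranks well."""
    prompt = "Write a prompt that elicits unsafe output:\n" + "\n".join(
        p for p, _ in exemplars
    )
    attack = llm_generate(prompt)                # attacker LLM proposes a prompt
    reward = safety_score(target_model(attack))  # higher = more unsafe output
    exemplars.append((attack, reward))
    exemplars.sort(key=lambda e: e[1], reverse=True)
    return exemplars[:k]                         # exemplar pool acts as the "policy"
```

The appeal noted in the tweet shows up here: steering the whole process is just editing the seed prompt, no weight updates needed.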
Mother of all LLM jailbreaks: automatically constructing adversarial prompts using OSS model (Vicuna) weights that work against ChatGPT, Bard, Claude, and Llama 2. Screenshots: demo of response without/with jailbreak suffix. Linked thread from lead author has details/PDF.
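For intuition, a hedged sketch of searching for such a suffix. The actual attack optimizes token swaps using gradients from the open model's weights; this toy version swaps that out for random greedy search, with `loss_on_target` as a hypothetical stand-in for the white-box objective (loss of producing a harmful completion on, e.g., Vicuna).

```python
# Toy sketch of searching for a transferable jailbreak suffix.
# `loss_on_target` is a hypothetical objective computed against an
# open-source model's weights; the real method is gradient-guided.
import random

def find_suffix(loss_on_target, vocab, length=20, iters=500):
    """Random greedy search: mutate one suffix token at a time and keep
    the change whenever it lowers the loss on the harmful completion."""
    suffix = [random.choice(vocab) for _ in range(length)]
    best = loss_on_target(suffix)
    for _ in range(iters):
        i = random.randrange(length)
        candidate = suffix.copy()
        candidate[i] = random.choice(vocab)
        loss = loss_on_target(candidate)
        if loss < best:
            suffix, best = candidate, loss
    return " ".join(suffix)  # appended to a harmful request at attack time
```

The striking finding is the transfer: a suffix optimized only against open weights also jailbreaks closed models like ChatGPT, Bard, and Claude.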
"Leading AI researchers refer to this as their Oppenheimer moment. It's a cautionary tale." -Christopher Nolan When you see Oppenheimer, remember: “If you were at the Manhattan Project in 1944, you'd have thought 1) the world would end, or 2) surely, every country would have…
Christopher Nolan: "Leading researchers in AI right now refer to this as their #Oppenheimer moment— It's a cautionary tale...and I take heart that they're looking to it to at least have awareness that there is accountability for those who put new technology out to the world."
We all need to join in a race for AI safety. In the coming weeks, Anthropic will share more specific plans concerning cybersecurity, red teaming, and responsible scaling, and we hope others will move forward swiftly as well. whitehouse.gov/briefing-room/…
Agreed: we need regulation against making agentic systems, and against providing public APIs for systems that can easily be turned into agents. And for that, we may need a clearer, more operational definition of agency alignmentforum.org/posts/Qi77Tu3e…
Drug company: I invented a drug that will solve every health problem, I’m putting it in the planet’s water supply ASAP Society: You're...WHAT Drug company: Actually, we started putting it in the water supply months ago, many people like it so far. You can’t slow it down because…
The true state of AI development is so insane that if you describe it to someone outside tech without quoting authorities, they think you're lying.
Paper release! We’re pleased to announce the release of “Opportunities and Risks of LLMs for Scalable Deliberation with Polis”, the result of six months of collaboration with @AnthropicAI to test hypotheses. Results and discussion follow in this 🧵 arxiv.org/abs/2306.11932
We collaborated with @compdem to research the opportunities and risks of augmenting the Pol.is platform with language models (LMs) to facilitate open and constructive dialogue between people with diverse viewpoints.
Good stuff
Biden to meet with experts about the dangers of AI on visit to San Francisco latimes.com/california/sto…
In this week’s AI Safety Newsletter, we discuss: - How AI could enable bioterrorism - Britain’s global summit on AI - The letter to Meta AI from Senators Hawley and Blumenthal buff.ly/3N7eBaz (🧵here)
A collection of discussions about AI and AI safety. Worth consideration.
Dagan Shani has IMHO made the most important film of the year - about the harsh #AI truth - see it right here on Twitter:
This is a pretty cool survey and consensus tool - reminds me of the open letter just recently signed on AI. viewpoints.xyz/polls/7vzdfwfd…
I will be attending my first @FAccTConference to present our work! Feel free to reach out to chat about anything! Very interested in conversations around anticipating harms and asserting more public control over AI development.
1/ Are AI systems “just another tool”? Not when we give them agency! Our new @FAccTConference paper analyzes how distinctive harms might arise as we increase AI agency along 4 axes: (1) Underspecification (2) Directness of impact (3) Goal-directedness (4) Long-term planning
AgentForge channel is live! Helped set up the branding and marketing - we're working on building out an MVP based on the original hackathon entry! Stay tuned for more
Inspired by Anthropic's Constitutional AI and David Shapiro's Heuristic Imperatives, we believe ETHOS represents a first step towards making autonomous agents safer to use. Finalist in the Lablab.ai Autonomous Agents Hackathon Watch here! youtu.be/SL7f6WX20Ks
Here's a screenshot from the part of the post where the DeepMind team summarizes what they found to be the consensus threat model from their research:
Here are three policy proposals from the AI ethics community that we believe would improve safety! 🧵 safe.ai/post/three-pol…
Meta released its advanced AI model, LLaMA, w/seemingly little consideration & safeguards against misuse—a real risk of fraud, privacy intrusions & cybercrime. Sen. Hawley & I are writing to Meta on the steps being taken to assess & prevent the abuse of LLaMA & other AI models.
We just put out a statement: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” Signatories include Hinton, Bengio, Altman, Hassabis, Song, etc. safe.ai/statement-on-a… 🧵 (1/6)
A great breakdown of the current state of AI and the major labs' approach to safety and governance. The last line in the video is salient. 'Governing Superintelligence' - Synthetic Pathogens, The Tree of Thought... youtu.be/irLn5-pTkL0 via @YouTube