Shayne Longpre

@ShayneRedford

Lead the Data Provenance Initiative. PhD @MIT. 🇨🇦 Prev: @Google Brain, Apple, Stanford. Interests: AI/ML/NLP, Data-centric AI, transparency & societal impact

Boston

Joined February 2015

1KFollowing

6KFollowers

Pinned

Shayne Longpre@ShayneRedford · Feb 12

I wrote a spicy piece on "AI crawler wars"🐞 in @MIT @techreview (my first op-ed)! While we’re busy watching copyright lawsuits & the EU AI Act, there’s a quieter battle over data access that affects websites, everyday users, and the open web. 🔗 technologyreview.com/2025/02/11/111… 1/

ShayneRedford's tweet image. I wrote a spicy piece on "AI crawler wars"🐞 in @MIT @techreview (my first op-ed)!

While we’re busy watching copyright lawsuits &amp; the EU AI Act, there’s a quieter battle over data access that affects websites, everyday users, and the open web.

🔗 technologyreview.com/2025/02/11/111…

1/

4.0K

Shayne Longpre@ShayneRedford · Jul 15

Excited to present our AI Flaw Disclosure paper at #ICML2025 in Vancouver!🌲🌊🏔️ Swing by our poster session in East Exhibition Halls A-B E-606!

SShayne Longpre@ShayneRedford · Mar 13

What are 3 concrete steps that can improve AI safety in 2025? 🤖⚠️ Our new paper, “In House Evaluation is Not Enough” has 3 calls-to-action to empower independent evaluators: 1️⃣ Standardized AI flaw reports 2️⃣ AI flaw disclosure programs + safe harbors. 3️⃣ A coordination…

2.0K

Shayne Longpre Retweeted

Ai2@allen_ai · Jul 9

Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵

418

185

317.0K

Shayne Longpre@ShayneRedford · Jul 9

It has been great working on the project with support from @allen_ai! I believe there are many meaningful ways different people and orgs can work together to build strong shared models, and data collaboration might be the most impactful form of it. 📄Paper:…

AAi2@allen_ai · Jul 9

Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵

193

28.0K

Shayne Longpre Retweeted

Will Knight@willknight · Jul 9

New on @WIRED: A novel type of distributed mixture-of-experts model from Ai2 (called FlexOlmo) allows data can be contributed to a frontier model confidentially, and even revoked after the model is built: wired.com/story/flexolmo…

30.0K

Shayne Longpre Retweeted

Almog@MITAlmog · Jul 9

🚨Can someone hack your ChatGPT? At MIT's Artificial Intelligence Lab, I discovered a security vulnerability in LLM providers. We reported it to @OpenAI , @AnthropicAI , @xai and others. Towards clinically useful and safe AI. Link to the paper below.

2.0K

Shayne Longpre@ShayneRedford · Jul 9

Sparse Mixture-of-Expert LLMs to opt data in & out on the fly — I think a compelling vision for a future where AI developers & publishers work together rather than filing lawsuits🙂

WWeijia Shi@WeijiaShi2 · Jul 9

Can data owners & LM developers collaborate to build a strong shared model while each retaining data control? Introducing FlexOlmo💪, a mixture-of-experts LM enabling: • Flexible training on your local data without sharing it • Flexible inference to opt in/out your data…

7.0K

Shayne Longpre@ShayneRedford · Jul 9

AAi2@allen_ai · Jul 9

Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵

269

52.0K

Shayne Longpre@ShayneRedford · Jul 9

This is a problem for AI security, safety, and transparency. AI companies should not be *threatening* or silencing good-faith research, & especially not **after its been responsibly disclosed.**

JJack Cable@jackhcable · Jul 8

Update: @cluely filed a DMCA takedown for my tweet about their system prompt, alleging that it contained "proprietary source code" Making legal threats against security researchers is not a good look, and I encourage Cluely to reflect on this and open doors to researchers. 🧵

2.0K

Shayne Longpre@ShayneRedford · Jul 8

Existing AI Agent benchmarks are broken 🤖💔 Great work by @maxYuxuanZhu and @daniel_d_kang identify + fix issues, and establish rigorous best practices for Agentic AI benchmarks! Check out the blog: ddkang.substack.com/p/ai-agent-ben…

ShayneRedford's tweet image. Existing AI Agent benchmarks are broken 🤖💔

Great work by @maxYuxuanZhu and @daniel_d_kang identify + fix issues, and establish rigorous best practices for Agentic AI benchmarks!

Check out the blog: ddkang.substack.com/p/ai-agent-ben…

114

13.0K

Shayne Longpre Retweeted

Daniel Kang@daniel_d_kang · Jul 8

As AI agents near real-world use, how do we know what they can actually do? Reliable benchmarks are critical but agentic benchmarks are broken! Example: WebArena marks "45+8 minutes" on a duration calculation task as correct (real answer: "63 minutes"). Other benchmarks…

21.0K

Shayne Longpre Retweeted

Almog@MITAlmog · Jul 8

Imagine a hacker causing ChatGPT to genuinely believe the stock market is crashing. No breach, no code, all an attacker needs is a malicious prompt and clicking 👍to change the model's weights. Our research paper demonstrates a vulnerability in LLMs RLHF mechanism:

3.0K

Shayne Longpre Retweeted

Technical AI Governance @ ICML 2025@taig_icml · Jul 8

🎙️ We are beyond proud to present the keynote speakers for the Technical AI Governance workshop at #ICML2025! Our speakers will share exciting work on AI governance from both research and policy contexts. 🧵👇

1.0K

Shayne Longpre Retweeted

jessica dai@jessicadai_ · Jul 1

individual reporting for post-deployment evals — a little manifesto (& new preprints!) tldr: end users have unique insights about how deployed systems are failing; we should figure out how to translate their experiences into formal evaluations of those systems.

134

27.0K

Shayne Longpre Retweeted

rishi@RishiBommasani · Jul 2

My PhD materials are now available! Dissertation: arxiv.org/abs/2506.23123 Slides: drive.google.com/file/d/13N2FRW… Folks should read the acknowledgements since so many people have been so important to me along this journey!

258

101

27.0K

Shayne Longpre Retweeted

Niloofar (✈️ ACL)@niloofar_mire · Jun 24

🪄We made a 1B Llama BEAT GPT-4o by... making it MORE private?! LoCoMo results: 🔓GPT-4o: 80.6% 🔐1B Llama + GPT-4o (privacy): 87.7% (+7.1!⏫) 💡How? GPT-4o provides reasoning ("If X then Y"), the local model fills in the blanks with your private data to get the answer!

188

106

18.0K