Eugene Bagdasarian
@ebagdasa
Challenge AI security and privacy practices. Asst Prof at UMass @manningcics. Researcher at @GoogleAI. he/him ๐ฆ๐ฒ (opinions mine)
Nerd sniping is probably the coolest description of this phenomena ( @woj_zaremba et al described it recently), but in our case overthinking didn't lead to any drastic consequences besides higher costs.
Ha! You can nerdsnipe reasoning models with decoy problems to make them overthink and slow them down/make them more expensive to run. arxiv.org/abs/2502.02542
How Sudokus can waste your money? If you are using reasoning LLMs with public data, adversaries could pollute it with nonsense (but perfectly safe!) tasks that will slow down reasoning and amplify overheads ๐ฐ (as you pay but not see reasoning tokens) while keeping answers intact
๐ง ๐ธ "We made reasoning models overthink โ and it's costing them big time." Meet ๐คฏ #OVERTHINK ๐คฏ โ our new attack that forces reasoning LLMs to "overthink," slowing models like OpenAI's o1, o3-mini & DeepSeek-R1 by up to 46ร by amplifying number of reasoning tokens. ๐ ๏ธ Keyโฆ
Filtering names w LLMs is easy, right? Plenty of privacy solutions out there claiming how well things work. However, our paper led by @dzungvietpham shows that things get tricky once we go to rare names in ambiguous contexts -- which could result in real harm if overlooked.
๐ Can LLMs reliably detect PII such as person names? โผ๏ธ Not really, especially if the context has ambiguity. ๐๏ธ Our work shows that LLMs can struggle to recognize person names in barely ambiguous contexts.
Thanks @niloofar_mire for moderating the session ๐! Thanks @EarlenceF, @jhasomesh , @christodorescu for organizing this awesome SAGAI workshop (and also inviting me, haha)!
Join us at the SAGAI workshop @IEEESSP, @ebagdasa is talking about contextual integrity and security for AI agents!
Our @IEEESSP SAGAI workshop on systems-oriented security for AI agents has speaker details (abs/bio) on the website now: sites.google.com/ucsd.edu/sagaiโฆ We look forward to seeing you in San Francisco on May 15! As a reminder, we are running this "Dagstuhl" style - real discussions.
I am looking for a postdoc to work on multi-agent safety problems, if you are interested or know anyone let me know: forms.gle/NFuYLKj53fVwdWโฆ
Amazing forward-looking paper on how collaboration could be done where you and I have different perspectives.
Suppose you and I both have different features about the same instance. Maybe I have CT scans and you have physician notes. We'd like to collaborate to make predictions that are more accurate than possible from either feature set alone, while only having to train on our own data.
The Privacy Preserving AI workshop is back! And is happening on Monday. I am excited about our program and lineup of invited speakers! I hope to see many of you there: ppai-workshop.github.io
(1/n) In our #ICLR2025 paper, we explore a fundamental issue that enables prompt injections: ๐๐๐๐ฌโ ๐ข๐ง๐๐๐ข๐ฅ๐ข๐ญ๐ฒ ๐ญ๐จ ๐ฌ๐๐ฉ๐๐ซ๐๐ญ๐ ๐ข๐ง๐ฌ๐ญ๐ซ๐ฎ๐๐ญ๐ข๐จ๐ง๐ฌ ๐๐ซ๐จ๐ฆ ๐๐๐ญ๐ ๐ข๐ง ๐ญ๐ก๐๐ข๐ซ ๐ข๐ง๐ฉ๐ฎ๐ญ โ Definition of separation ๐ SEP Benchmark ๐ LLM evals on SEP
Amazing opportunity to do ground breaking work in LLMs!
We now have a form for postdoc applications: forms.gle/tiydAChgV1wLcQโฆ I am looking at candidates on a rolling basis, so while there's no deadline, there's an advantage of throwing your name in the ring earlier than later