Shruti Joshi
@_shruti_joshi_
phd student in identifiable representation learning @Mila_Quebec. prev. research programmer @MPI_IS Tübingen, undergrad @IITKanpur '19.
1\ Hi, can I get an unsupervised sparse autoencoder for steering, please? I only have unlabeled data varying across multiple unknown concepts. Oh, and make sure it learns the same features each time! Yes! A freshly brewed Sparse Shift Autoencoder (SSAE) coming right up. 🧶
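A rough, illustrative sketch of the idea described in this thread, not the paper's implementation: fit a sparse autoencoder to differences (shifts) between paired representations so that each latent dimension captures one concept-level change that can later be used for steering. The class name, dimensions, pairing of inputs, and training loop below are all assumptions for illustration.

```python
# Hypothetical sketch of a "sparse shift autoencoder": reconstruct representation
# DIFFERENCES with a sparsity penalty on the code. Dimensions and names are made up.
import torch
import torch.nn as nn

class SparseShiftAutoencoder(nn.Module):
    def __init__(self, dim: int, n_latents: int):
        super().__init__()
        self.encoder = nn.Linear(dim, n_latents)   # shift -> sparse code
        self.decoder = nn.Linear(n_latents, dim)   # sparse code -> shift

    def forward(self, shift: torch.Tensor):
        code = self.encoder(shift)
        return self.decoder(code), code

def training_step(model, x_a, x_b, l1_weight=1e-3):
    """x_a, x_b: paired embeddings that differ in a few unknown concepts (assumed given)."""
    shift = x_b - x_a                              # representation difference
    recon, code = model(shift)
    recon_loss = (recon - shift).pow(2).mean()     # reconstruct the shift
    sparsity = code.abs().mean()                   # encourage few active concept latents
    return recon_loss + l1_weight * sparsity

# Toy usage on random data, purely for illustration.
model = SparseShiftAutoencoder(dim=768, n_latents=32)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_a, x_b = torch.randn(64, 768), torch.randn(64, 768)
opt.zero_grad()
loss = training_step(model, x_a, x_b)
loss.backward()
opt.step()
```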

I am thrilled to announce that I will be joining the Gatsby Computational Neuroscience Unit at UCL as a Lecturer (Assistant Professor) in Feb 2025! Looking forward to working with the exceptional talent at @GatsbyUCL on cutting-edge problems in deep learning and causality.
We are delighted to announce that Dr Leena Chennuru Vankadara will join the Unit as Lecturer in Feb 2025, developing a theoretical understanding of scaling and generalization in deep learning and causality. Welcome aboard @leenaCvankadara! Learn more at ucl.ac.uk/gatsby/news-an…
🚨 New Paper! 🚨 Guard models slow, language-specific, and modality-limited? Meet OmniGuard: a single approach that detects harmful prompts across multiple languages & modalities, with SOTA performance in all 3 modalities while being 120X faster 🚀 arxiv.org/abs/2505.23856
⚡⚡ Llama-Nemotron-Ultra-253B just dropped: our most advanced open reasoning model 🧵👇
𝐓𝐡𝐨𝐮𝐠𝐡𝐭𝐨𝐥𝐨𝐠𝐲 paper is out! 🔥🐋 We study the reasoning chains of DeepSeek-R1 across a variety of tasks and settings and find several surprising and interesting phenomena! Incredible effort by the entire team! 🌐: mcgill-nlp.github.io/thoughtology/
Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks, investigating its capabilities, limitations, and behaviour. 🔗: mcgill-nlp.github.io/thoughtology/
Presenting ✨ 𝐂𝐇𝐀𝐒𝐄: 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐧𝐠 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐢𝐧𝐠 𝐬𝐲𝐧𝐭𝐡𝐞𝐭𝐢𝐜 𝐝𝐚𝐭𝐚 𝐟𝐨𝐫 𝐞𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧 ✨ Work w/ fantastic advisors @DBahdanau and @sivareddyg Thread 🧵:
📣 📣 📣 Our new paper investigates the question of how many images 🖼️ of a concept are required by a diffusion model 🤖 to imitate it. This question is critical for understanding and mitigating the copyright and privacy infringements of these models! arxiv.org/abs/2410.15002
🚨NEW PAPER OUT 🚨 Excited to share our latest research initiative on in-context learning and meta-learning through the lens of information theory! 🧠 🔗 arxiv.org/abs/2410.14086 Check out our insights and empirical experiments! 🔍
Introducing our new paper explaining in-context learning through the lens of Occam’s razor, giving a normative account of next-token prediction objectives. This was with @Tom__Marty @tejaskasetty @le0gagn0n @sarthmit @MahanFathi @dhanya_sridhar @g_lajoie_ arxiv.org/abs/2410.14086
Presenting tomorrow at #NAACL2024: 𝐶𝑎𝑛 𝐿𝐿𝑀𝑠 𝑖𝑛-𝑐𝑜𝑛𝑡𝑒𝑥𝑡 𝑙𝑒𝑎𝑟𝑛 𝑡𝑜 𝑢𝑠𝑒 𝑛𝑒𝑤 𝑝𝑟𝑜𝑔𝑟𝑎𝑚𝑚𝑖𝑛𝑔 𝑙𝑖𝑏𝑟𝑎𝑟𝑖𝑒𝑠 𝑎𝑛𝑑 𝑙𝑎𝑛𝑔𝑢𝑎𝑔𝑒𝑠? 𝑌𝑒𝑠. 𝐾𝑖𝑛𝑑 𝑜𝑓. Internship @allen_ai work with @pdasigi and my advisors @DBahdanau and @sivareddyg.
Adversarial Triggers For LLMs Are 𝗡𝗢𝗧 𝗨𝗻𝗶𝘃𝗲𝗿𝘀𝗮𝗹!😲 It is believed that adversarial triggers that jailbreak a model transfer universally to other models. But we show triggers don't reliably transfer, especially to RLHF/DPO models. Paper: arxiv.org/abs/2404.16020
📢 Exciting new work on AI safety! Do adversarial triggers transfer universally across models (as has been claimed)? 𝗡𝗼. Are models aligned by supervised fine-tuning safe against adversarial triggers? 𝗡𝗼. RLHF and DPO are far better!
Presenting tomorrow at #EMNLP2023: MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations w/ amazing advisors and collaborators @DBahdanau, @sivareddyg, and @satwik1729
1/ Excited for our oral presentation at #NeurIPS2023 on "Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation"! A theoretical paper about object-centric representation learning (OCRL), disentanglement & extrapolation arxiv.org/abs/2307.02598
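A minimal illustrative sketch of what an "additive decoder" refers to, under the assumption that the latent vector is partitioned into per-object blocks and the output is the sum of each block's decoding. Block sizes, network widths, and the toy usage are assumptions for illustration, not the paper's exact architecture.

```python
# Hypothetical additive decoder: output = sum of per-block decodings of the latent.
import torch
import torch.nn as nn

class AdditiveDecoder(nn.Module):
    def __init__(self, n_blocks: int, block_dim: int, out_dim: int):
        super().__init__()
        self.block_dim = block_dim
        # One small decoder per latent block (e.g., per object/slot).
        self.block_decoders = nn.ModuleList([
            nn.Sequential(nn.Linear(block_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))
            for _ in range(n_blocks)
        ])

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, n_blocks * block_dim). The output is the SUM of per-block
        # decodings, which is what makes the decoder "additive".
        blocks = z.split(self.block_dim, dim=-1)
        return sum(dec(zb) for dec, zb in zip(self.block_decoders, blocks))

# Toy usage: 3 latent blocks of size 4 decoded to a flattened 8x8 output.
decoder = AdditiveDecoder(n_blocks=3, block_dim=4, out_dim=64)
x_hat = decoder(torch.randn(16, 12))
print(x_hat.shape)  # torch.Size([16, 64])
```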