Sidak Pal Singh
@unregularized
Research Scientist at Google Research, working on Gemini. (prev. PhD at ETH Zürich & MPI-IS Tübingen.) No second-hand opinions. They are absolutely my own ;)
📢 I'll be presenting two posters at the #ICML2024 HiLD workshop (Straus 2) today (assuming no further ✈️ delays):
- Closed form of the Hessian spectrum for some neural networks: openreview.net/forum?id=gW30R…
- Landscaping Linear Mode Connectivity: openreview.net/forum?id=OSNMq…


when you go beyond linear mode connectivity, interesting things happen 😮👇 x.com/Theus__A/statu…
1/ 🚨 New paper alert! 🚨 We explore a key question in deep learning: Can independently trained Transformers be linearly connected in weight space — without a loss barrier? Yes — if you uncover their rich symmetries. 📄 arXiv: arxiv.org/abs/2506.22712
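A minimal toy sketch of the quantity behind that question (my own illustration, not the paper's procedure): evaluate the loss along the straight line between two weight settings and measure how far it rises above the line joining the endpoints. The `loss_fn` and the weight dictionaries below are hypothetical placeholders.

```python
# Toy sketch of a linear-interpolation loss barrier (not the paper's procedure).
# `loss_fn` and the weight dictionaries are hypothetical placeholders.
import numpy as np

def interpolation_losses(theta_a, theta_b, loss_fn, num_points=11):
    """Loss at evenly spaced points on the segment between theta_a and theta_b."""
    losses = []
    for alpha in np.linspace(0.0, 1.0, num_points):
        theta = {k: (1 - alpha) * theta_a[k] + alpha * theta_b[k] for k in theta_a}
        losses.append(loss_fn(theta))
    return losses

def barrier(losses):
    """Maximum rise of the interpolated loss above the line joining the endpoints."""
    baseline = np.linspace(losses[0], losses[-1], len(losses))
    return float(np.max(np.array(losses) - baseline))

# Dummy usage with a quadratic "loss" over a single weight vector.
theta_a = {"w": np.array([0.0, 0.0])}
theta_b = {"w": np.array([2.0, 0.0])}
dummy_loss = lambda th: float(np.sum((th["w"] - 1.0) ** 2))
print(barrier(interpolation_losses(theta_a, theta_b, dummy_loss)))  # 0.0: no barrier here
```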
Belated life update: 🎓 PhD — done 🔬 Joined Google in NYC 🗽as a Research Scientist ♊️ Gemini: now more than just my star sign :)
🚀 TOMORROW afternoon at ICLR: Learn about the directionality of optimization trajectories in neural nets and how it inspires a potential way to make LLM pretraining more efficient ♻️ (Poster #585, Hall 2B)
Ever wondered what the optimization trajectories look like when training neural nets & LLMs🤔? Do they contain a lot of twists 💃 and turns, or does the direction largely remain the same🛣️? We explore this in our work for LLMs (up to 12B params) + ResNets on ImageNet. Key findings👇
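As a toy illustration of what "directionality" could mean here (my own sketch, not the paper's metric), one can look at the cosine similarity between successive parameter updates taken from saved checkpoints: values near 1 indicate a nearly straight trajectory, values near 0 lots of turning.

```python
# Toy sketch (not the paper's exact metric): quantify how directional a
# trajectory is via cosine similarity between consecutive parameter updates.
# Checkpoints are assumed to be flat NumPy vectors of the model's weights.
import numpy as np

def update_cosines(checkpoints):
    """Cosine similarity between consecutive updates w_{t+1} - w_t."""
    deltas = [b - a for a, b in zip(checkpoints[:-1], checkpoints[1:])]
    cosines = []
    for d1, d2 in zip(deltas[:-1], deltas[1:]):
        denom = np.linalg.norm(d1) * np.linalg.norm(d2) + 1e-12
        cosines.append(float(d1 @ d2 / denom))
    return cosines

# Example: a nearly straight trajectory gives cosines close to 1.
ckpts = [np.array([0.0, 0.0]), np.array([1.0, 0.1]), np.array([2.0, 0.15])]
print(update_cosines(ckpts))
```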
Don't miss our spotlight ✨paper at ICLR 🇸🇬 about the loss landscape of Transformers and their special heterogeneous structure, done together with great collaborators! x.com/wormaniec/stat…
Ever wondered how the loss landscape of Transformers differs from that of other architectures? Or which Transformer components make its loss landscape unique? With @unregularized & @f_dangel, we explore this via the Hessian in our #ICLR2025 spotlight paper! Key insights👇 1/8
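For intuition about the object being studied (my own toy sketch, unrelated to the paper's actual analysis): the Hessian spectrum of a loss is just the set of eigenvalues of its matrix of second derivatives, which for a tiny model can be computed numerically.

```python
# Toy sketch (not the paper's analysis): numerical Hessian eigenvalues of a
# tiny one-hidden-unit network's squared loss, via central differences.
import numpy as np

def loss(w, x, y):
    """Tiny 'network': y_hat = w[1] * tanh(w[0] * x), with mean squared error."""
    y_hat = w[1] * np.tanh(w[0] * x)
    return float(np.mean((y_hat - y) ** 2))

def numerical_hessian(f, w, eps=1e-4):
    """Central-difference Hessian of f at w."""
    n = len(w)
    h = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            w_pp = w.copy(); w_pp[i] += eps; w_pp[j] += eps
            w_pm = w.copy(); w_pm[i] += eps; w_pm[j] -= eps
            w_mp = w.copy(); w_mp[i] -= eps; w_mp[j] += eps
            w_mm = w.copy(); w_mm[i] -= eps; w_mm[j] -= eps
            h[i, j] = (f(w_pp) - f(w_pm) - f(w_mp) + f(w_mm)) / (4 * eps ** 2)
    return h

x = np.linspace(-1.0, 1.0, 32)
y = 0.5 * x
w = np.array([0.3, 0.7])
H = numerical_hessian(lambda w_: loss(w_, x, y), w)
print(np.linalg.eigvalsh(H))  # the Hessian spectrum of this toy loss
```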
✨New Preprint ✨ Ever thought that reconstructing masked pixels for image representation learning seems sub-optimal? In our new preprint, we show how masking principal components, rather than raw pixel patches, improves Masked Image Modelling (MIM). Find out more below 🧵
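A minimal toy sketch of the idea of "masking in PCA space rather than pixel space" (my own illustration, not the preprint's actual method): project flattened patches onto their principal components, zero out a random subset of components, and map back.

```python
# Toy sketch (not the preprint's method): mask a random subset of principal
# components of flattened image patches instead of masking raw pixel patches.
import numpy as np

def mask_principal_components(patches, mask_ratio=0.75, rng=None):
    """patches: (N, D) array of flattened patches. Returns PC-masked patches."""
    rng = np.random.default_rng(rng)
    mean = patches.mean(axis=0, keepdims=True)
    centered = patches - mean
    # Principal directions via SVD of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    coeffs = centered @ vt.T                              # project onto the PCs
    k = coeffs.shape[1]
    masked_idx = rng.choice(k, size=int(mask_ratio * k), replace=False)
    coeffs[:, masked_idx] = 0.0                           # zero out masked components
    return coeffs @ vt + mean                             # reconstruct the masked inputs

patches = np.random.rand(64, 16 * 16 * 3)                 # 64 random 16x16 RGB patches
masked = mask_principal_components(patches, mask_ratio=0.75, rng=0)
print(masked.shape)
```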
Don’t miss our poster shedding more light on sharpness regularization at NeurIPS tomorrow neurips.cc/virtual/2024/p…
Reinventing things has a bad rep in today's age. But is it really that bad? Maybe it's even something to be cultivated, selectively? The second post in this series of blogs is now out. Let's have a deeper look at this overused trope! wovencircuits.substack.com/p/reinventing-…

I’m exploring a new form of writing—threads of human curiosity woven through the circuits of AI, crafting reflections that are, in the end, fully machine-generated, yet in a way profoundly human. wovencircuits.substack.com/p/the-spirit-b…
Come, let's scale up the building one floor, And, layer up the neural networks once more. Soon our buildings will touch the sky, And, our computers will bear AGI. A quaint little hut in the mountains is out of fashion, Satisfaction has no gradients for backpropagation. ~Fitoor
At this paper count, recalling all the paper names would already be a big feat :)
Source: papercopilot.com/paper-list/neu…
“Hypotheses are nets: only he who casts will catch.” - Novalis

At last, some attempts to change the status quo: authors with three or more papers are obligated to review for ICLR x.com/PreetumNakkira…
Review requirements! (And a 10-page limit!)
There goes away my Austrian flight to #ICML2024 🥲

Tuesday 1:30pm-3pm, Hall C 4-9 #515. Drop by our poster if you are interested in SSMs for graphs👇! Code: github.com/skeletondyh/GR…