Deqing Fu
@DeqingFu
PhD-ing @CSatUSC. Alum @UChicago, B.S. '20, M.S. '22. Interpretability of LLMs; DL Theory; NLP | prev. research intern @MetaAI @Google
1+1=3 2+2=5 3+3=? Many language models (e.g., Llama 3 8B, Mistral v0.1 7B) will answer 7. But why? We dig into the model internals, uncover a function induction mechanism, and find that it’s broadly reused when models encounter surprises during in-context learning. 🧵
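A minimal sketch (not the authors' code) of reproducing the prompt setup described in the tweet with Hugging Face transformers; the checkpoint ID is an assumption, and any open base model can be substituted.

```python
# Minimal sketch: query a base LLM with the "surprising" in-context pattern
# from the tweet and inspect its greedy completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "1+1=3\n2+2=5\n3+3="
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
# Per the tweet, many models continue with "7", i.e. they induce the
# "add one to the true sum" function from the two in-context examples.
```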
We show that you can control and steer layout, style, etc. in diffusion models using SAEs.
I’ll be at #CVPR2025 presenting a spotlight talk at the VisCon Workshop on our latest work: "Emergence and Evolution of Interpretable Concepts in Diffusion Models" 🕒 June 12th, 3:15PM CDT A short thread summarizing the paper 🧵
How to make SAEs useful beyond interpretability and steering? @UpupWang's work Resa shows: 🧐SAEs can capture reasoning features (as an interpretability tool) 🤔SAEs can further elicit strong reasoning abilities via SAE-tuning the model (a stronger claim than steering, imho)
Sparse autoencoders (SAEs) can be used to elicit strong reasoning abilities with remarkable efficiency. Using only 1 hour of training at $2 cost without any reasoning traces, we find a way to train 1.5B models via SAEs to score 43.33% Pass@1 on AIME24 and 90% Pass@1 on AMC23.
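For readers unfamiliar with the object these threads refer to, here is a minimal sketch of a sparse autoencoder over model activations. The dimensions and the L1 sparsity penalty are illustrative assumptions; this is not the Resa / SAE-tuning recipe itself.

```python
# Minimal sketch of a sparse autoencoder (SAE) over residual-stream activations.
# Dimensions and the L1 coefficient are illustrative assumptions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 2048, d_features: int = 16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations: torch.Tensor):
        features = torch.relu(self.encoder(activations))  # sparse feature codes
        reconstruction = self.decoder(features)
        return reconstruction, features

def sae_loss(x, reconstruction, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that encourages sparse features.
    return ((x - reconstruction) ** 2).mean() + l1_coeff * features.abs().mean()
```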
Excited to be at @CVPR 2025! Looking forward to catching up with old friends and meeting new ones. If you're interested in grabbing coffee, trying out new restaurants, or chatting about generative representation learning, feel free to DM me! A quick summary of my recent works:…
🧐When do LLMs admit their mistakes when they should know better? In our new paper, we define this behavior as retraction: the model indicates that its generated answer was wrong. LLMs can retract—but they rarely do.🤯 arxiv.org/abs/2505.16170 👇🧵
I’ll be giving a talk at Stanford NLP seminar tomorrow about our recent work on multimodal LLMs.
For this week’s NLP Seminar, we are thrilled to host @DeqingFu to talk about Closing the Modality Gap: Benchmarking and Improving Visual Understanding in Multimodal LLMs! When: 5/22 Thurs 11am PT Non-Stanford affiliates registration form (closed at 9am PT on the talk day):…
Textual steering vectors can improve visual understanding in multimodal LLMs! You can extract steering vectors via any interpretability toolkit you like -- SAEs, MeanShift, Probes -- and apply them to the image or text tokens (or both) of multimodal LLMs. And they steer!
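A minimal sketch of the difference-of-means ("MeanShift") variant mentioned above, applied via a forward hook. The layer index, scale, and the assumption of a LLaMA-style `model.model.layers` list are illustrative, not the paper's exact setup.

```python
# Minimal sketch (assumed layer index, scale, and hook target) of a
# difference-of-means steering vector added to a decoder layer's output.
import torch

def mean_shift_vector(pos_acts: torch.Tensor, neg_acts: torch.Tensor) -> torch.Tensor:
    # pos_acts / neg_acts: [num_examples, d_model] hidden states collected
    # from prompts with and without the target property.
    return pos_acts.mean(dim=0) - neg_acts.mean(dim=0)

def add_steering_hook(model, layer_idx: int, vector: torch.Tensor, scale: float = 4.0):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * vector.to(hidden.dtype).to(hidden.device)
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    # Assumes a LLaMA-style decoder layer list; adjust the path for other models.
    return model.model.layers[layer_idx].register_forward_hook(hook)
```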
It’s a great honor to receive a Best Research Assistant Award!
Congratulations to all of our @CSatUSC @USCViterbi graduate award recipients! Premankur Banerjee (PhD), Kegan Strawn (PhD), Deqing Fu (PhD) and Zhaotian Weng (MS)! 🎉🏆 @DeqingFu @USCAdvComputing @LarsLindemann2
#NAACL2025 Check out DreamSync's poster tomorrow (May 1) at Hall 3, 4:00-5:30pm. Feel free to stop by to chat about multimodality, evaluation, and interpretability. We are also planning an interpretability lunch tomorrow. Find it on Whova and join!
