Sara Vera Marjanović
@saraveramarjano
PhD fellow in NLP IR & XAI 🏠 @uni_copenhagen @MLSectionUCPH @CopeNLU ✈️ @Mila_Quebec @mcgill_nlp // Recreational sufferer.
Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour. 🔗: mcgill-nlp.github.io/thoughtology/

Nice work! We observed a similar trend on certain math tasks in our work: arxiv.org/abs/2504.07128 Section 4.1 has a discussion of our findings. You might want to consider citing it :) cc @saraveramarjano @arkil_patel @sivareddyg
Have you ever wondered whether a few instances of data contamination really lead to benchmark overfitting?🤔 Then our latest paper on the effect of data contamination on LLM evals might be for you!🚀 "How Much Can We Forget about Data Contamination?" (accepted at #ICML2025) shows…
Excited to share the results of my internship research with @AIatMeta, as part of a larger world modeling release! What subtle shortcuts are VideoLLMs taking on spatio-temporal questions? And how can we instead curate shortcut-robust examples at large scale? Details 👇🔬
Our vision is for AI that uses world models to adapt in new and dynamic environments and efficiently learn new skills. We’re sharing V-JEPA 2, a new world model with state-of-the-art performance in visual understanding and prediction. V-JEPA 2 is a 1.2 billion-parameter model,…
"Build the web for agents, not agents for the web" This position paper argues that rather than forcing web agents to adapt to UIs designed for humans, we should develop a new interface optimized for web agents, which we call Agentic Web Interface (AWI).
A great collab with former labmates @AntChen_ & Dongyan! Interesting cognitive limitation in LMs: strong disjunctive bias leads to poor performance on conjunctive causal inference tasks. Mirrors adult human biases—possibly a byproduct of training data priors.
Language model (LM) agents are all the rage now—but they may have cognitive biases when inferring causal relationships! We evaluate LMs on psychology tasks and find: - LMs struggle with certain simple causal relationships - They show biases similar to human adults (but not children) 🧵⬇️
Congratulations to Mila members Ada Tur, Gaurav Kamath and @sivareddyg for their SAC award at #NAACL2025! Check out Ada's talk in Session I: Oral/Poster 6. Paper: arxiv.org/abs/2502.05670
In "Investigating Human Values in Online Communities", we perform a large-scale study of the unique values expressed by online communities arxiv.org/abs/2402.14177 #NAACL2025 #NLProc @NadavBorenstein @rnav_arora @frimelle @IAugenstein x.com/NadavBorenstei…
Ever wondered which subreddit is the most benevolent? In a new paper (preprint: arxiv.org/abs/2402.14177), @rnav_arora, @frimelle, @IAugenstein and I annotated 6M posts across 10k subreddits with Schwartz values.
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories We are releasing the first benchmark to evaluate how well automatic evaluators, such as LLM judges, can evaluate web agent trajectories. We find that rule-based evals underreport success rates, and…
DeepSeek-R1 marks a shift in LLM reasoning. But what happens when we dive deep into its "thoughts"? 🤔 A new "Thoughtology" paper explores DeepSeek-R1's reasoning chains, capabilities, limitations & even its safety concerns.
DeepSeek-R1 Thoughtology now #2 on @huggingface daily papers Thanks for building this great platform for sharing new papers @_akhaliq
DeepSeek-R1 Thoughtology: Let’s <think> about LLM reasoning 142-page report diving into the reasoning chains of R1. It spans 9 unique axes: safety, world modeling, faithfulness, long context, etc.
[DeepSeek, Reasoning] paper DeepSeek-R1 Thoughtology: Let's <think> about LLM reasoning An excellent analysis paper on DeepSeek-R1, which even coins a new term for it: thoughtology. First of all, this is the power of DeepSeek being open source. Compared with OpenAI's o1, DeepSeek-R1…
Thoughtology is trending today on hf daily papers! Read our paper for a detailed analysis of R1’s long chains of thoughts across a variety of settings. huggingface.co/papers/2504.07…
And thoughtology is now on Arxiv! Read more about R1 reasoning 🐋💭 across visual, cultural and psycholinguistic tasks at the link below: 🔗 arxiv.org/abs/2504.07128
I'm so grateful to @bcs_irsg @TechAtBloomberg for honouring me with the Karen Spärck Jones Award 🙏 I gave the award lecture on LLMs’ Utilisation of Parametric & Contextual Knowledge at #ECIR2025 today (slides: isabelleaugenstein.github.io/slides/2025_EC…) bcs.org/membership-and… #NLProc @CopeNLU
Introducing nanoAhaMoment: a Karpathy-style, single-file RL-for-LLMs library (<700 lines) - super hackable - no TRL / Verl, no abstraction💆♂️ - Single GPU, full param tuning, 3B LLM - Efficient (R1-zero countdown < 10h) Comes with a from-scratch, fully spelled-out YT video [1/n]
Talking about "DeepSeek-R1 Thoughtology: Let’s <think> about LLM reasoning" Going live at 11am PDT (i.e., 20 mins). Last minute change of plans. You might be able to see live here: youtube.com/watch?v=aO_cTI…
I will be giving a talk about this work @SimonsInstitute tomorrow (Apr 2nd 3PM PT). Join us, both in-person or virtually. simons.berkeley.edu/workshops/futu…