Arnav Arora
@rnav_arora
CS PhD student @CopeNLU. Research Intern @Apple. Low-resource NLP + CSS: misinformation, media framing, culture, LLMs. Formerly @checkstep. @rnv.bsky.social he/him.
It was a joy to work w/ @meriembeloucif & @SaifMMohammad on our position paper "Building Better: Avoiding Pitfalls in Developing Language Resources when Data is Scarce" & happy to see it accepted at #ACL2025NLP (main) arxiv.org/abs/2410.12691 Many thanks to our survey respondents!
New #NLProc preprint on how to build better datasets/tools for mid- to low-resource languages while respecting the labor of the data workers: arxiv.org/abs/2410.12691 We examine 1) *why* people work on mid- to low-resource languages and 2) whether they get properly credited.🧵1/
I do (work on detecting opinions from text) and this is a very slippery slope. AI is dual-use: authoritarian regimes can similarly justify finding all the people speaking up against them and putting them in jail.
I don't even work on the kind of AI that they are talking about here, but shit like this makes me seriously consider quitting the whole field for good. It's not the fun, curiosity-driven, open field I joined 15 years ago. It's all much darker now.
It is critical for scientific integrity that we trust our measure of progress. The @lmarena_ai has become the go-to evaluation for AI progress. Our release today demonstrates the difficulty in maintaining fair evaluations on @lmarena_ai, despite best intentions.
In "Investigating Human Values in Online Communities", we perform a high-scale study of the unique values expressed by online communities arxiv.org/abs/2402.14177 #NAACL2025 #NLProc @NadavBorenstein @rnav_arora @frimelle @IAugenstein x.com/NadavBorenstei…
Ever wondered which subreddit is the most benevolent? In a new paper (preprint: arxiv.org/abs/2402.14177), @rnav_arora, @frimelle, @IAugenstein and I annotated 6M posts across 10k subreddits with Schwartz values.
Disconcerting AI 'experiments' in r/changemyview from Uni of Zurich, which should have never passed IRB.
Happy to share that I've joined the Apple Machine Learning Research team in Copenhagen as a research intern! Will continue to build on topics from my PhD, equitably advancing LLM access for all, working with @maartjeterhoeve and @natschluter

And thoughtology is now on arXiv! Read more about R1 reasoning 🐋💭 across visual, cultural and psycholinguistic tasks at the link below: 🔗 arxiv.org/abs/2504.07128
Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour. 🔗: mcgill-nlp.github.io/thoughtology/
🎓 Great opportunity for a fully funded PhD in interpretable NLP at the @DIKU_Institut & @AiCentreDK with @IAugenstein and me @CopeNLU! 📆 Application deadline: 15 May 2025 👀 Reasons to apply: copenlu.com/post/why-ucph/ 🔗 Apply here: candidate.hr-manager.net/ApplicationIni… #NLProc #XAI
Massive congrats to @NadavBorenstein for passing his PhD defence yesterday! 👏👏👏 Thanks to Anders Søgaard, @radamihalcea, and @RicoSennrich for serving as examiners. You can find his thesis here, which the committee characterised as "a joy to read": arxiv.org/abs/2503.07395
So it should be. That's the point of ethics: to stop you from doing something you'd like to do, but that would be wrong.
Ethics is holding back scientific innovation and progress
✨ Thrilled to share that our paper, "Evaluating Input Feature Explanations through a Unified Diagnostic Evaluation Framework," has been accepted at NAACL 2025—with an Oral presentation! 🎉 A huge thank you to my incredible supervisors @Isabelle and @atanasovapepa . #NAACL2025
📢New preprint! : arxiv.org/abs/2502.14132 Fact-checkers play a vital role in combating misinformation on social media, but have come under intense scrutiny in the current political climate.
Had great discussions @Cardiff_NLP. Thank you for hosting me @nedjmaou!
We had the pleasure to host @j6mes and @rnav_arora in our CardiffNLP seminar. Thank you for the two great talks! Topics: "Preference Alignment, with Reference Mismatch, and without Reference Models" by @j6mes and "NLP and CSS: lessons from one for the other" by @rnav_arora.
the grok 3 release made me sad. something fatalistic about falling back to brute-force scaling — 100x more compute than R1 for a model that’s at most 10% better. all that time, money, and electricity spent on a system that will be obsolete before my semester ends. AI needs new…