Arya McCarthy
@aryamccarthy
massively multilingual #NLProc • translation • @amazon fellow • jelinek fellow • PhD @jhuclsp • was @googleai, @duolingo, and @facebookai
“We use the same training data as GPT-3 (Brown et al., 2020), which we will denote ‘the Brown corpus’.”
✨I’m on the faculty job market for 2024-2025! ✨ My research focuses on advancing Responsible AI—enhancing factuality, robustness, and transparency in AI systems. I’m at #EMNLP2024 this week🌴 and would love to chat about research and hear any advice!
Drop by my talk at LREC-COLING on Thursday on unsupervised cognate induction between closely related data-imbalanced language pairs :) arxiv.org/pdf/2305.14012
Congrats to CLSP’ers graduating this year! 🥳🥳 Photo credit: @esalesk
Step 0. If there’s a peer-reviewed version, cite that instead. Usually conferences and publishers put effort into BibTeX that’s more complete and beautiful than arXiv’s. (Did this for TACL with @mjpost)
PSA on how to properly cite arXiv articles. Step 1. Use biblatex (with the biber backend (!) and the natbib option, if you like your old-school commands, like I do). For NeurIPS, you'll have to pass the [nonatbib] option to the neurips package. Code at end of thread.
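Here's a minimal sketch of that preamble. The authoryear style, the neurips_2024 file name, and the brown2020language key are my assumptions, not necessarily the thread's exact code:

\documentclass{article}
% [nonatbib] stops the NeurIPS style from loading natbib itself,
% so biblatex can provide \citet/\citep instead.
\usepackage[nonatbib]{neurips_2024}
\usepackage[backend=biber,natbib=true,style=authoryear]{biblatex}
\addbibresource{references.bib} % your .bib file

\begin{document}
\citet{brown2020language} introduced GPT-3. % hypothetical entry key
\printbibliography
\end{document}

Then compile with pdflatex → biber → pdflatex to resolve the citations.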
New pub w/ @debutts in @Journal_Of_Comm "Reporting after Removal: Effects of Journalist Expulsion on Foreign News Coverage". What happened to coverage of China after a large portion of the reporting corps of @nytimes, @washingtonpost, @WSJ were expelled from China in 2020? 1/3
In 🇦🇹 this week @iclr_conf to present:
1⃣ Privacy-preserving in-context learning with DP few-shot generation arxiv.org/abs/2309.11765 🖼️ Tue 10:45 B-229
2⃣ Can LLMs Keep a Secret? Testing Privacy Implications via Contextual Integrity arxiv.org/abs/2310.17884 🖼️ Wed 16:30 B-215
Some good news that I'm excited to share today: I've received an NSF CAREER award! This award will support my group's research on robust, fair, and culturally aware commonsense reasoning in natural language. Feeling grateful to my students & the many people who have supported me!
👏 Congrats to @rachelrudinger for receiving an @NSF CAREER award for her work on 'Robust, Fair, and Culturally Aware Commonsense Reasoning in Natural Language.' Looking forward to seeing how this research shapes the future! Read more: go.umd.edu/Rudinger-NSF-C…
Until recently, I really enjoyed the Bai drinks. They rebranded the drink from 'antioxidant infusion' to 'wonderwater' and shrunk the bottle, but charge the same price. It tastes different to me; I don't like it, and I think it's worse now.
There’s always a relevant xkcd: m.xkcd.com/810/
WHAT: "AI-Curated Democratic Discourse," a JSALT hackathon team this summer (Jun 10-Aug 2) GOAL: Redesign the social media UI to raise the quality of reading and posting, with the help of LLMs🤯 WHO: Looking for 1 more funded, in-person NLP PhD student! DM me with yr skillz.
Undoubtedly, it is Raffel et al. (2020), the T5 paper, which came earlier. You can see the beginnings of supervised instruction/prompting there (see red highlights), and it was the first paper to show you can formulate any task as a text-to-text task and train multiple…
Stiennon et al. (2020). The first paper I know that used RLHF+PPO to finetune LLMs to follow human instructions (here: specifically summarizing texts). That was even 2 years (!) before InstructGPT. arxiv.org/abs/2009.01325
I know EACL was last month, but at least I did (eventually) do it😶🌫️ Read our paper: "Code-Switched Language Identification is Harder Than You Think" aclanthology.org/2024.eacl-long…
Me: exact inference is NP-hard
Someone somewhere: have you tried Mamba
Lots of great libraries here, not least of which is @PatrickKidger’s Equinox, which is one of the core libraries Levanter is built on. github.com/patrick-kidger…
Levanter from @StanfordCRFM has earned recognition at #GoogleCloudNext as a popular #Jax repository to build Foundation Models 🙌 There have been many great improvements to Levanter in the past few months led by @dlwh, particularly in achieving impressive MFU numbers on both TPU…
so do I tell this guy that a16z and andreessen horowitz are the same VC
I am very happy to announce our new work "FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning"! 📜: arxiv.org/abs/2404.02127 💾: huggingface.co/datasets/lawin… 🧵👇 1/7
great call out: 1. The NN learned neither from observations nor from self-play. It learned from a teacher: the non-neural program Stockfish. 2. During training, search was used indirectly (via Stockfish), but not during actual play. Whether or not this counts as using search is debatable.
It is very tempting to read into the "grandmaster chess without search" result way beyond what's actually there. I was also misled on first read, but further reading and reflection brought me down to earth. I wrote a bit about it. gist.github.com/yoavg/8b98bbd7…
One of the best pieces of advice I internalized from my advisor is "smell the data". If you don't do it, you pay for it in compute and other ways, and from my experience working with others, most don't. That's one of the reasons why we have overly complicated archs, objectives, and…
Glad you enjoyed it!
I really appreciate how dynamic and fun the @BlackboxNLP workshop was. Big credit to an incredibly awesome team of organizers @boknilev @najoungkim @notaphonologist @aryamccarthy @JumeletJ @hmohebbi75 🎉💪! Also thankful to the ~500 audience members who joined the 9 AM talk in person!