Arya McCarthy
@aryamccarthy
massively multilingual #NLProc • translation • @amazon fellow • jelinek fellow • PhD @jhuclsp • was @googleai, @duolingo, and @facebookai
“We use the same training data as GPT-3 (Brown et al., 2020), which we will denote ‘the Brown corpus’.”
✨I’m on the faculty job market for 2024-2025! ✨ My research focuses on advancing Responsible AI—enhancing factuality, robustness, and transparency in AI systems. I’m at #EMNLP2024 this week🌴 and would love to chat about research and hear any advice!
Drop by my talk at LREC-COLING on Thursday on unsupervised cognate induction between closely related data-imbalanced language pairs :) arxiv.org/pdf/2305.14012
Congrats to CLSP’ers graduating this year! 🥳🥳 Photo credit: @esalesk
Step 0. If there’s a peer-reviewed version, cite that instead. Usually conferences and publishers put effort into BibTeX that’s more complete and beautiful than arXiv’s. (Did this for TACL with @mjpost)
PSA on how to properly cite arXiv articles. Step 1. Use biblatex (with the biber backend (!) and the natbib option, if you like your old-school commands, like I do). For NeurIPS, you'll have to pass the [nonatbib] option to the neurips package. Code at end of thread.
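Here's a minimal sketch of that preamble. The authoryear style, the neurips_2024 file name, and the brown2020language key are my assumptions, not necessarily the thread's exact code:

\documentclass{article}
% [nonatbib] stops the NeurIPS style from loading natbib itself,
% so biblatex can provide \citet/\citep instead.
\usepackage[nonatbib]{neurips_2024}
\usepackage[backend=biber,natbib=true,style=authoryear]{biblatex}
\addbibresource{references.bib} % your .bib file

\begin{document}
\citet{brown2020language} introduced GPT-3. % hypothetical entry key
\printbibliography
\end{document}

Then compile with pdflatex → biber → pdflatex to resolve the citations.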
New pub w/ @debutts in @Journal_Of_Comm "Reporting after Removal: Effects of Journalist Expulsion on Foreign News Coverage". What happened to coverage of China after a large portion of the reporting corps of @nytimes, @washingtonpost, @WSJ were expelled from China in 2020? 1/3
In 🇦🇹 this week @iclr_conf to present:
1⃣ Privacy-preserving in-context learning with DP few-shot generation arxiv.org/abs/2309.11765 🖼️ Tue 10:45 B-229
2⃣ Can LLMs Keep a Secret? Testing Privacy Implications via Contextual Integrity arxiv.org/abs/2310.17884 🖼️ Wed 16:30 B-215
Some good news that I'm excited to share today: I've received an NSF CAREER award! This award will support my group's research on robust, fair, and culturally aware commonsense reasoning in natural language. Feeling grateful to my students & the many people who have supported me!
👏 Congrats to @rachelrudinger for receiving an @NSF CAREER award for her work on 'Robust, Fair, and Culturally Aware Commonsense Reasoning in Natural Language.' Looking forward to seeing how this research shapes the future! Read more: go.umd.edu/Rudinger-NSF-C…
Until recently, I really enjoyed the Bai drinks. They rebranded the drink from 'antioxidant infusion' to 'wonderwater' and shrunk the bottle, but charge the same price. It tastes different to me; I don't like it, and I think it's worse now.
There’s always a relevant xkcd: m.xkcd.com/810/
WHAT: "AI-Curated Democratic Discourse," a JSALT hackathon team this summer (Jun 10-Aug 2) GOAL: Redesign the social media UI to raise the quality of reading and posting, with the help of LLMs🤯 WHO: Looking for 1 more funded, in-person NLP PhD student! DM me with yr skillz.
Undoubtedly, it is Raffel et al. (2020), the T5 paper, which came earlier. You can see the beginnings of supervised instruction/prompting there (see red highlights), and it was the first paper to show you can formulate any task as a text-to-text task and train multiple…
Stiennon et al. (2020). The first paper I know that used RLHF+PPO to finetune LLMs to follow human instructions (here: specifically summarizing texts). That was even 2 years (!) before InstructGPT. arxiv.org/abs/2009.01325
I know EACL was last month, but at least I did (eventually) do it😶🌫️ Read our paper: "Code-Switched Language Identification is Harder Than You Think" aclanthology.org/2024.eacl-long…
Me: exact inference is NP-hard
Someone somewhere: have you tried Mamba
Lots of great libraries here, not least of which is @PatrickKidger’s Equinox, which is one of the core libraries Levanter is built on. github.com/patrick-kidger…
Levanter from @StanfordCRFM has earned recognition at #GoogleCloudNext as a popular #Jax repository to build Foundation Models 🙌 There have been many great improvements to Levanter in the past few months led by @dlwh, particularly in achieving impressive MFU numbers on both TPU…
so do I tell this guy that a16z and andreessen horowitz are the same VC
I am very happy to announce our new work "FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning"! 📜: arxiv.org/abs/2404.02127 💾: huggingface.co/datasets/lawin… 🧵👇 1/7
great call out: 1. The NN learned neither from observations nor from self-play. It learned from a teacher: the non-neural program Stockfish. 2. During training, search was used indirectly (via Stockfish), but not during actual play. Whether or not this counts as using search is debatable.
It is very tempting to read into the "grandmaster chess without search" result way beyond what's actually there. I was also misled on first read, but further reading and reflection brought me down to earth. I wrote a bit about it. gist.github.com/yoavg/8b98bbd7…
One of the best pieces of advice I internalized from my advisor is "smell the data". If you don't do it, you pay for it in compute and other ways, and from my experience working with others, most don't. That's one of the reasons why we have overly complicated archs, objectives, and…
Glad you enjoyed it!
I really appreciate how dynamic and fun the @BlackboxNLP workshop was. Big credit to an incredibly awesome team of organizers @boknilev @najoungkim @notaphonologist @aryamccarthy @JumeletJ @hmohebbi75 🎉💪! Also thankful to the ~500 audience members who joined the 9 AM talk in person!