Katherine Thai
@kthai1618
I did CS, Math, and English at @rutgersu. Now I do NLP as a PhD student @UMass_NLP and @pangramlabs (from NYC). she/her
my life is just a repeating sequence of this scene: at hibachi, chef does the veggie toss thing, it hits me in the face, I require therapy
We have updated #nocha, a leaderboard for reasoning over long-context narratives 📖, with some new models including #Gemini 2.5 Pro, which shows massive improvements over the previous version! Congrats to the #Gemini team 🪄 🧙 Check the 🧵 for a 🔗 to the website :)
My first PhD paper started with something like “Because it’s impossible to feed an entire novel to a language model, here’s what we did instead.” 😂 Anyway, here’s NEW work on tricky long context (& often subtle!) claim verification. I personally read 3 novels for the dataset 📚
Can #LLMs truly reason over loooong context? 🤔 NoCha asks LLMs to verify claims about *NEW* fictional books 🪄 📚 ⛔ LLMs that solve needle-in-the-haystack (~100%) struggle on NoCha! ⛔ None of the 11 tested LLMs reaches human performance → 97%. The best, #GPT-4o, gets only 55.8%.
🎉🎉🎉 So happy this is finally out there!!!
📢Happy to announce litmt.org, a platform for sharing and commenting on LLM-generated translations of novels into over 20 target languages. LitMT aims to make previously-untranslated world literature accessible beyond language barriers. 📚 1/6
I was napping for like 80% of the time Ben spent building this
I’m learning so much and having the best time with @pangramlabs this summer. Congrats to the team!
Thrilled to announce that @pangramlabs closed its seed round, bringing our total raised to $4M! I'm so excited to continue our mission serving schools, internet platforms, and more with incredibly accurate AI detection technology.
🤔 What if you gave an LLM thousands of random human-written paragraphs and told it to write something new -- while copying 90% of its output from those texts? 🧟 You get what we call a Frankentext! 💡 Frankentexts are surprisingly coherent and tough for AI detectors to flag.
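A rough illustration of the setup described above (not the paper's actual prompt or corpus): sample a pile of unrelated human-written paragraphs and instruct the model to compose something new while lifting most of its output verbatim from them. The prompt wording, the 90% copy ratio, and the `load_paragraphs` helper below are all hypothetical.

```python
import random

def load_paragraphs(path: str) -> list[str]:
    # Hypothetical helper: one human-written paragraph per line.
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def build_frankentext_prompt(paragraphs: list[str], n_snippets: int = 100,
                             copy_ratio: float = 0.9, seed: int = 0) -> str:
    # Sample unrelated human-written paragraphs as source material.
    rng = random.Random(seed)
    snippets = rng.sample(paragraphs, k=min(n_snippets, len(paragraphs)))
    numbered = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(snippets, 1))
    # Illustrative instruction: write something new, but copy most of it verbatim.
    return (
        "Write a short, coherent story on any topic. At least "
        f"{int(copy_ratio * 100)}% of your output must be copied verbatim "
        "from the numbered source paragraphs below; you may add brief "
        "connective text of your own.\n\nSource paragraphs:\n" + numbered
    )

# Usage: prompt = build_frankentext_prompt(load_paragraphs("human_paragraphs.txt"))
```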
"Tell, Don't Show" was accepted to #ACL2025 Findings! Our simple approach for literary topic modeling combines the new (language models) with the old (classic LDA) to yield better topics. A possible addition to your CSS/DH research 🛠️ box ✨📚 arxiv.org/abs/2505.23166
🤔 Can simple string-matching metrics like BLEU rival reward models for LLM alignment? 🔍 We show that given access to a reference, BLEU can match reward models in human preference agreement, and even train LLMs competitively with them using GRPO. 🫐 Introducing BLEUBERI:
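A minimal sketch of the core idea (not the paper's exact training setup): score each model completion with sentence-level BLEU against a gold reference and use that score as the reward signal. The `bleu_reward` name and the 0–1 rescaling are my own choices.

```python
import sacrebleu  # pip install sacrebleu

def bleu_reward(completions: list[str], references: list[str]) -> list[float]:
    # One reward per completion: sentence-level BLEU against its reference,
    # rescaled from sacrebleu's 0-100 range to 0-1.
    return [
        sacrebleu.sentence_bleu(completion, [reference]).score / 100.0
        for completion, reference in zip(completions, references)
    ]

# Example: the completion closer to the reference gets the higher reward.
refs = ["The cat sat on the mat."]
print(bleu_reward(["The cat sat on the mat."], refs))      # ~1.0
print(bleu_reward(["A dog ran through the park."], refs))  # near 0
```

In a setup like TRL's GRPOTrainer, which accepts custom reward functions, a function of this shape could stand in for a learned reward model; the paper's actual training configuration may differ.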
What is @pangramlabs and how is it different from other AI detection tools out there? And how do we convince a skeptic? Our false positive rate is roughly 1 in 10,000. How do we assess this number, and how can an academic trust it? 1/
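For context, here is one standard way an outside academic could sanity-check a claimed 1-in-10,000 false positive rate, independent of how Pangram computes it internally: run the detector on N known-human documents, count the k flagged as AI, and report an exact (Clopper–Pearson) binomial confidence interval on the rate. The sample size and counts below are made up.

```python
from scipy.stats import beta

def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    # Exact binomial confidence interval for a proportion k/n.
    lower = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    upper = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lower, upper

# Made-up numbers: 3 false positives out of 50,000 human-written documents.
k, n = 3, 50_000
lo, hi = clopper_pearson(k, n)
print(f"observed FPR = {k/n:.5f}, 95% CI = ({lo:.5f}, {hi:.5f})")
```

With these made-up counts the interval spans very roughly 1 in 80,000 to 1 in 6,000, which is the kind of statement a skeptical reviewer can actually evaluate against a 1-in-10,000 claim.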
Getting lots of replies pushing Pangram Labs here. They claim very low false positive rates on their website. I remain doubtful without independent assessment of false positives (this study was not meant to do that), & concerned that these detectors are used adversarially.
After a great team lunch (ft. mango sticky rice) and successful brainstorming sesh, I’m so excited to start working with @pangramlabs for the summer as a research scientist intern! Hard not to be excited when you hear the team talk about what they’re working on—stay tuned :)
Introducing 🐻 BEARCUBS 🐻, a “small but mighty” dataset of 111 QA pairs designed to assess computer-using web agents in multimodal interactions on the live web! ✅ Humans achieve 85% accuracy ❌ OpenAI Operator: 24% ❌ Anthropic Computer Use: 14% ❌ Convergence AI Proxy: 13%
Super cool project I contributed to where we evaluated a bunch of web browsing agents on really interesting multimodal challenges—I got to watch agents try to play some silly flash games and accidentally click on an ad for pants 😂
⚠️ Current methods for generating instruction-following data fall short for long-range reasoning tasks like narrative claim verification. We present CLIPPER✂️, a compression-based pipeline that produces grounded instructions for ~$0.5 each, 34x cheaper than human annotations.
People often claim they know when ChatGPT wrote something, but are they as accurate as they think? Turns out that while the general population is unreliable, those who frequently use ChatGPT for writing tasks can spot even "humanized" AI-generated text with near-perfect accuracy 🎯
Summer update to our NoCha long-context LLM leaderboard! Main highlight: LLaMA3.1 405B is the first (and currently only) open-weight model to convincingly beat the random guessing baseline of 25%, ranking at #5 overall! novelchallenge.github.io
I hate those airport security setups where the bins could really use a scheduler so much!!!
I just went to my first craft circle and people recognized me from my fiber arts instagram and it felt even better than when people recognize me at conferences
Karma is not having any deadlines last night 💜 Thanks to @taylorswift13 for soundtracking my academic career from middle schooler to PhD student. #EastRutherfordTSTheErasTour

Interviewed a new therapist who has never heard of ChatGPT. As an NLP grad student, is this good or bad?
my friend hired a couch assembly task rabbit who did it but complained the entire time that it was hard and they “could have been out with friends instead” and my friend was like “you willingly accepted this job ???” but as a PhD student I totally get it