Dayeon (Zoey) Ki
@zoeykii
CS PhD @umdclip | MT, Multilingual, Cultural #NLProc | 🇰🇷🇨🇳🇨🇿🇺🇸
📢When LLMs solve tasks with a mid- or low-resource input/target language, their output quality is poor. We know that. But can we pin down what breaks inside the LLM? We introduce the 💥translation barrier hypothesis💥 for failed multilingual generation. arxiv.org/abs/2506.22724
I will be presenting our work 𝗠𝗗𝗖𝘂𝗿𝗲 at #ACL2025NLP in Vienna this week! 🇦🇹 Come by if you’re interested in multi-doc reasoning and/or scalable creation of high-quality post-training data! 📍 Poster Session 4 @ Hall 4/5 🗓️ Wed, July 30 | 11-12:30 🔗 aclanthology.org/2025.acl-long.…
🔥Thrilled to introduce MDCure: A Scalable Pipeline for Multi-Document Instruction-Following 🔥 How can we systematically and scalably improve LLMs' ability to handle complex multi-document tasks? Check out our new preprint to find out! Details in 🧵 (1/n):
Maybe don't use an LLM for _everything_? Last summer, I got to fiddle with content diversity again at @AdobeResearch @Adobe, and we showed that agentic pipelines mixing LLM-prompt steps with principled techniques can yield better, more personalized summaries
I'm excited to announce that my nonfiction book, "Lost in Automatic Translation: Navigating Life in English in the Age of Language Technologies", will be published this summer by Cambridge University Press. I can't wait to share it with you! 📖🤖 cambridge.org/core/books/los…
(Repost due to mistaken deletion😢): Evaluating topic models (& doc clustering methods) is hard. In fact, since our paper critiquing standard eval practices 4 years ago, there hasn't been a good replacement metric. That ends today! Our ACL paper introduces a new evaluation🧵
How do standard metrics work? Automated coherence computes how often the top n words in a topic appear together in some reference text (e.g., Wikipedia). This fails to consider which *documents* are associated with each topic, and so doesn't transfer well to text clustering methods.
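For intuition, here is a minimal sketch of that kind of document co-occurrence (NPMI-style) coherence over a reference corpus. The function name `npmi_coherence` and the exact smoothing are illustrative assumptions, not the specific formulation used by any particular toolkit or by the paper above.

```python
import math
from itertools import combinations

def npmi_coherence(top_words, reference_docs, eps=1e-12):
    """Average pairwise NPMI of a topic's top-n words over reference documents.

    A sketch assuming `reference_docs` is a list of tokenized reference
    documents (e.g., Wikipedia articles) and `top_words` has at least 2 words.
    """
    doc_sets = [set(doc) for doc in reference_docs]
    n_docs = len(doc_sets)

    def doc_prob(*words):
        # Fraction of reference documents that contain all the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = doc_prob(w1), doc_prob(w2), doc_prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: minimum NPMI
            continue
        pmi = math.log(p12 / (p1 * p2 + eps))
        scores.append(pmi / (-math.log(p12) + eps))
    return sum(scores) / len(scores)

# Toy usage: coherence of a "pets" topic against a tiny reference corpus.
docs = [["cat", "dog", "pet"], ["dog", "bone", "pet"], ["stock", "market"]]
print(npmi_coherence(["cat", "dog", "pet"], docs))
```

Note that the score only looks at word co-occurrence in the reference corpus; nothing in it touches the documents your model actually assigned to the topic, which is exactly the gap the tweet points out.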
You have a budget to human-evaluate 100 inputs to your models, but your dataset is 10,000 inputs. Do not just pick 100 randomly!🙅 We can do better. "How to Select Datapoints for Efficient Human Evaluation of NLG Models?" shows how.🕵️ (random is still a devilishly good baseline)
📣Thrilled to announce the drop of EXAONE 4.0, the next-generation hybrid AI. 🙌Prepare to be amazed by EXAONE’s capabilities. #EXAONE #LG_AI_Research #HybridAI #AI lgresearch.ai/blog/view?seq=…
CLIPPER has been accepted to #COLM2025! In this work, we introduce a compression-based pipeline to generate synthetic data for long-context narrative reasoning tasks. Excited to be in Montreal this October🍁
⚠️ Current methods for generating instruction-following data fall short for long-range reasoning tasks like narrative claim verification. We present CLIPPER✂️, a compression-based pipeline that produces grounded instructions for ~$0.5 each, 34x cheaper than human annotations.
Why should you attend this talk? 🤔 A. Nishant put so much effort B. Learn the real limitations of MCQA C. Great takeaways for building better benchmarks D. All of the above ✔️
Our position paper was selected for an oral at #ACL2025! Definitely attend if you want to hear spicy takes on why MCQA benchmarks suck and how education researchers can teach us to solve these problems 👀
Super grateful to share that our work has been accepted as #ACL2025 oral presentation 🍀✨ See you in Vienna! 🇦🇹
1/ Are two #LLMs better than one for equitable cultural alignment? 🌍 We introduce a Multi-Agent Debate framework — where two LLM agents debate the cultural adaptability of a given scenario. #ACL2025 🧵👇
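As a rough illustration of the debate loop described above (not the paper's actual implementation), here is a sketch in which two agents take alternating turns on a scenario before a final judgment; `call_llm`, the prompts, and the number of rounds are all hypothetical placeholders.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; swap in your own client.
    return f"[model reply to: {prompt[:40]}...]"

def multi_agent_debate(scenario: str, rounds: int = 2) -> str:
    """Two agents debate the cultural adaptability of a scenario, then a
    final prompt aggregates the exchange into a verdict (illustrative only)."""
    history = []
    for _ in range(rounds):
        for agent in ("Agent A", "Agent B"):
            prompt = (
                f"You are {agent}. Argue whether the following scenario is "
                f"culturally appropriate, responding to the debate so far.\n"
                f"Scenario: {scenario}\n"
                f"Debate so far:\n" + "\n".join(history)
            )
            history.append(f"{agent}: {call_llm(prompt)}")
    verdict_prompt = (
        "Summarize the debate and give a final judgment on cultural "
        "appropriateness.\n" + "\n".join(history)
    )
    return call_llm(verdict_prompt)

print(multi_agent_debate("Bringing a gift to a first business meeting"))
```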
🚀 Tower+: our latest model in the Tower family — sets a new standard for open-weight multilingual models! We show how to go beyond sentence-level translation, striking a balance between translation quality and general multilingual capabilities. 1/5 arxiv.org/pdf/2506.17080
Ever wondered what makes language models generate overly verbose, vague, or sycophantic responses? Our new paper investigates these and other idiosyncratic biases in preference models, and presents a simple post-training recipe to mitigate them! Thread below 🧵↓