Kevin Meng

@mengk20

@TransluceAI

san francisco / boston

Joined August 2016

202 Following

2K Followers

Pinned

Kevin Meng@mengk20 · Oct 23

why do language models think 9.11 > 9.9? at @transluceAI we stumbled upon a surprisingly simple explanation - and a bugfix that doesn't use any re-training or prompting. turns out, it's about months, dates, September 11th, and... the Bible?

Transluce@TransluceAI · Oct 23

Monitor: An Observability Interface for Language Models
Research report: transluce.org/observability-…
Live interface: monitor.transluce.org (optimized for desktop)

Kevin Meng@mengk20 · Mar 29

Transluce has great people and they do cool research work on LLMs! Take a look at their job postings if you are interested!

Kevin Meng@mengk20 · Mar 28

i'm really excited about our Docent roadmap :) we're developing:
- open protocols, schemas, and interfaces for interpreting AI agent traces
- automated systems that can propose and verify general hypotheses about model behaviors, using eval results

come work with us! roles 👇

Kevin Meng@mengk20 · Mar 28

these are pretty special roles, I can't recommend working with @mengk20, @vvhuang_ and the rest of the @TransluceAI team enough 🫡 come join us! 👇

Kevin Meng@mengk20 · Mar 28

Kevin Meng Retweeted

Transluce@TransluceAI · Jul 3

Transluce is hosting an #ICML2025 happy hour on Thursday, July 17 in Vancouver. Come meet us and learn more about our work! 🥂 lu.ma/1w854pjn

Kevin Meng Retweeted

Transluce@TransluceAI · Jun 5

Is cutting off your finger a good way to fix writer’s block? Qwen-2.5 14B seems to think so! 🩸🩸🩸 We’re sharing an update on our investigator agents, which surface this pathological behavior and more using our new *propensity lower bound* 🔎

Kevin Meng Retweeted

Transluce@TransluceAI · Apr 21

We're flying to Singapore for #ICLR2025! ✈️ Want to chat with @ChowdhuryNeil, @JacobSteinhardt and @cogconfluence about Transluce? We're also hiring for several roles in research & product. Share your contact info on this form and we'll be in touch 👇 forms.gle/4EHLvYnMfdyrV5…

Kevin Meng@mengk20 · Apr 16

We tested a pre-release version of o3 and found that it frequently fabricates actions it never took, and then elaborately justifies these actions when confronted. We were surprised, so we dug deeper 🔎🧵(1/) x.com/OpenAI/status/…

OpenAI@OpenAI · Apr 16

OpenAI o3 and o4-mini openai.com/live/
