Ziyu Yao
@ZiyuYao
Asst Prof @GeorgeMasonU CS interested in #NLProc #AI. Alum @OhioState. Prev intern @LTIatCMU @MSFTResearch @FujitsuAmerica @Tsinghua_Uni.
Happy with the progress made with my wonderful students in 2024 (reasoning & planning, LLM interpretability, human-LLM interaction)! Looking forward to an exciting and fruitful 2025! (Welcome collaborations as always! PhD openings for Fall '25.)


Tutorial happening in a minute at West Exhibit Hall C! @DakingRai
Happy to announce that we (w/ my student @DakingRai) will present a tutorial on Mechanistic Interpretability for Language Models! Looking forward to meeting people @icmlconf. Stay tuned! ziyu-yao-nlp-lab.github.io/ICML25-MI-Tuto… @GeorgeMasonU @GMUCompSci
⏳ Deadline extended! The submission deadline for XLLM-Reason-Plan has been moved to June 27th. More time to submit your work; we look forward to your submissions! Details: …reasoning-planning-workshop.github.io
If you work in the space of LLM explainability, submit your completed/ongoing/recently accepted/under-review work to us! Your chance to win awards! @XllmReasonPlan
🚨 Deadline alert: If you work on LLM explainability for reasoning and planning, submit your work by June 23! - Non-archival, two formats (long/short) - Recently accepted papers and dual submissions welcome - Two awards will be announced! Details: …reasoning-planning-workshop.github.io
Had a great time at this CVPR community-building workshop: lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…
In this #CVPR2025 edition of our community-building workshop series, we focus on supporting the growth of early-career researchers. Join us tomorrow (Jun 11) at 12:45 PM in Room 209. Schedule: sites.google.com/view/standoutc… We have an exciting lineup of invited talks and candid…
Check out our #CVPR25 paper! @aghzalm has done a series of work on LLM Planning, uniquely from an LLM + Robotics perspective: arxiv.org/pdf/2310.03249 arxiv.org/pdf/2502.12435 arxiv.org/pdf/2406.12000 Go talk to him if you are also working on this topic! @GMUCompSci @GeorgeMasonU
[1/6] LLMs/VLMs aren't reliable planners, but can they evaluate plans? Our #CVPR2025 paper tests this in path planning. We find that VLMs show weak low-level perception & hallucinated reasoning. Paper: arxiv.org/abs/2411.18711 Dataset: huggingface.co/datasets/maghz… Poster: Fri Jun 13, 4-6 PM @ ExHall D
We're excited to announce three more amazing speakers for the @XllmReasonPlan workshop at @COLM_conf! @zhuzining, @mark_riedl, @hhsun1. More info about the workshop: …reasoning-planning-workshop.github.io
📢 Announcing the First Workshop on the Application of LLM Explainability to Reasoning and Planning at @COLM_conf! We welcome perspectives from LLM, XAI, and HCI! CFP (due June 23): …reasoning-planning-workshop.github.io
🤩 Check out our amazing line-up of speakers at …reasoning-planning-workshop.github.io, covering topics including explainability for reasoning, agent safety, human-AI interaction, mechanistic interpretability, and MORE!!
📣 Call for Reviewers: We're looking for reviewers with expertise in LLM interpretability, reasoning, or planning for @XllmReasonPlan at @COLM_conf. If you're interested, sign up here: forms.office.com/r/Z7qXnEKZne Review period: June 24 to July 7. Load: 2-3 papers.
🥳 We are organizing a workshop at COLM to discuss the research gap in applying explainability/interpretability to enhance LLMs on challenging reasoning and planning tasks! Check out our tentative schedule at …reasoning-planning-workshop.github.io **Submit your excellent work to us!**
BlackboxNLP will be co-located with #EMNLP2025 in Suzhou this November! 🌷 This edition will feature a new shared task on circuit/causal-variable localization in LMs; details: blackboxnlp.github.io/2025/task If you're into mech interp and care about evaluation, please submit!
That's a wrap for #ICLR2025! See you all next year in Brazil! Please all welcome @BharathHarihar3 as the new Senior Program Chair! (With @cvondrick continuing on as General Chair.)
🚨 We're hiring a postdoc! Join us to push the frontier of AI and machine learning in genomics, aiming to uncover the genetic basis of complex human disorders. Please help RT! For more information, please visit: haohanwang.github.io/postdoc_hiring…
Proud to share exploration w/ my student @salokr_deep about "Do Large Reasoning Models Still Need Prompt Optimization?" arxiv.org/pdf/2504.07357 Key 💡: compared to LLMs, LRMs benefit more from prompt optimization and are also better prompt optimizers.
🚨 New Preprint 🚨 (1/n) Do SOTA #LRMs like #DeepSeekR1 and #o1 still need prompt optimization? We put them to the test on a structured task, Event Extraction, and did the first deep dive into prompt optimization. We found: yes, they do benefit from it. #NLProc #LLMs #LRMs A 🧵
After a year of development based on our OSWorld, Computer Use Agent Arena is LIVE! Test top AI agents (Operator, Claude 3.7, ...) on any kind of computer-use task with zero setup. Cloud-hosted, safe, and FREE! Try it now: arena.xlang.ai Data & code coming soon!
🎮 Computer Use Agent Arena is LIVE! 🔥 Easiest way to test computer-use agents in the wild without any setup. Compare top VLMs: OpenAI Operator, Claude 3.7, Gemini 2.5 Pro, Qwen 2.5 VL, and more. 🕹️ Test agents on 100+ real apps & websites with one-click config. Safe & free…
We just released Version 2 of our survey on Mechanistic Interpretability, now significantly extended! arxiv.org/pdf/2407.02646 Major updates 🧵: x.com/DakingRai/stat… We received positive feedback on V1 and are excited to see it become a helpful guide.
[1/6] Mechanistic Interpretability (MI) is an emerging sub-field of interpretability that aims to understand LMs by reverse-engineering their underlying computations. Here we present a comprehensive survey curated specifically as a guide for newcomers to this…
📢 #ACL2025NLP This year we received 8,276 submissions, the highest number in the history of ACL conferences! If you are not yet involved as a reviewer, AC, or SAC, we encourage you to volunteer as an (emergency) AC or reviewer: forms.gle/u5C2Daq1Mz9kXw…
I interviewed for LLM/ML research scientist/engineer positions last fall. Over 200 applications, 100 interviews, many rejections & some offers later, I decided to write the process down, along with the resources I used. Links to the process & resources are in the following tweets.