Mikyo
@mikeldking
Building the future of LLMOps. Head of Open-Source at @arizeai @ArizePhoenix | Former eng @apple ☕️🚲🧗🏻♂️🏂⛰🕴
Two recent quotes changed the way I think about developing code: 🎤 “One of the risks I see is fossilization — of the libraries and tools we use.” – Laurie Voss from @llama_index (linkedin.com/posts/auth0_de…) 🎤 "...they're computers but they are humanlike [...] there's people…
![mikeldking's tweet image](https://pbs.twimg.com/media/GwPwi7YXIAAFIHe.jpg)
![mikeldking's tweet image](https://pbs.twimg.com/media/GwPwi7LXYAIpobo.jpg)
![mikeldking's tweet image](https://pbs.twimg.com/media/GwPwi7LW4AI0vTl.jpg)
![mikeldking's tweet image](https://pbs.twimg.com/media/GwPwi7NW8AQTn7M.jpg)
📈 @ArizePhoenix now has project dashboards! In the latest release @arizeai Phoenix comes with a dedicated project dashboard with: 📈 Trace latency and errors 📈 Latency Quantiles 📈 Annotation Scores Timeseries 📈 Cost over Time by token type 📊 Top Models by Cost 📊 Token…
Wild.
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
This is a great post by @sanjanayed and aligns well with what @HamelHusain and @sh_reya pitch in their evals course as well. You don't want to outsource your annotations. It makes a lot of sense to use tools that let you build your own annotation tools (using @v0, @lovable_dev…
Just wrapped up a tutorial - I use a custom annotations tool to build an end-to-end evaluation & experimentation pipeline🚀 Inspired by an article from @eugeneyan, I explore how to leverage annotations to construct evals, design thoughtful experiments, and systematically improve…
The biggest thing preventing @cursor_ai and Claude Code from being magical for me is that they forget to run code formatters like prettier and ruff. I always have to intervene at the end.
TIL about jsonc - how did it take me so long? Now can we get package.jsonc?
So many enterprises have been asking for LLM tracing support in Java and Kotlin. Excited to make this happen via @ArizePhoenix #oss
🚀 Introducing OpenInference Java! We're excited to announce the launch of OpenInference Java, a comprehensive solution for tracing AI applications using OpenTelemetry. This is fully compatible with any OpenTelemetry-compatible collector or backend! 📦 What’s included: ✅…
The deer in headlights look in the eyes of interview candidates that are addicted to cursor.
"I use AI in a separate window. I don't enjoy Cursor or Windsurf, I can literally feel competence draining out of my fingers." @dhh, the legendary programmer and creator of Ruby on Rails, has the most beautiful and philosophical idea about what AI takes away from programmers.
Vibe coding? Try vibe understanding the generated code before you tab tab tab or press accept.
AI for code editing is not perfect and is often frustrating. But as an OSS maintainer, there are things that are starting to come into view. What tools like @cursor_ai can help you do is shift your role - rather than the code producer, you become the compiler. The AI takes…
I created an annotated version of @beirmug's presentation on IR Evals for RAG. Nandan argues that we should consider additional retrieval metrics beyond the classics (MRR, etc.) because retrieval goals for RAG can sometimes be very different hamel.dev/notes/llm/rag/…
PSA: don’t use products that were sold to you, use the ones that you genuinely love and enjoy
Aravind Srinivas: “have the intellectual humility” to know what you’re good and not good at. youtu.be/2jOnoTEk-xA?si… I also love that he triages bugs. 🐛
It’s happening! #Golang github.com/modelcontextpr…
Tracing and telemetry have traditionally been an operational requirement, not a development one. But I've found that with AI applications this fundamentally changes. Take LLM-as-a-judge, for example. You might pick an off-the-shelf eval library and trust that it works. But this…
I’ve wondered this myself. How do you even evaluate this? @cursor_ai add ways to do pairwise evals or something
For those of you using Cursor / Windsurf / Zed etc... Has anyone written evals for a rules file, or is it all vibes-based? There are some good examples on cursor.directory, but I'm trying to figure out the right balance of length / examples.
Using the tool I helped build to build the tool I'm building. This is the way. #Oss
