Wey Gu 古思为
@wey_gu
I build things, troubleshoot & sing. GenAI Lead @NebulaGraph. Microsoft MVP. Co-host @osf2f. Feminist. Prev @Ericsson Cloud. Building @nowledgelabs at night.
🚀 Thrilled that the NebulaGraph/悦数 team received an invitation from Prof. Arijit @rijitK and Xi Chen to the @VLDB2024 KG-LLM workshop to present our GraphRAG work. 📚 @beautyyuyanli and I will represent the #NebulaGraph team next Monday afternoon at VLDB2024, sharing innovations and insights from NebulaGraph RAG. 👋🏻 See you in Guangzhou at #VLDB2024! seucoin.github.io/workshop/llmkg…



I should have maintained task list summaries.👍
This is a quick way to implement features in Cursor.
- Use a New Chat
- Use past chats as a lightweight summary
- Link your Tasklist
- Use file references to tell Cursor where to make changes
Nail the hardest part first. x.com/wey_gu/status/…
complexity first, simplicity second
people say “keep it simple,” but most approach it backwards. they start from simple, then add on complexity without seeing the whole. that’s how you end up with frankenstein products: clean-looking components awkwardly stitched together, held…
👍
Compared to GRPO, GSPO offers significant advantages in stability, efficiency, performance, and infra-friendliness. Furthermore, it fundamentally and naturally resolves the stability issues in the RL training of large MoE models 💪
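As I understand it (a sketch in my own notation, not necessarily the paper's exact formulation), the core change versus GRPO is moving importance weighting from the token level to a length-normalized sequence level, so clipping happens once per response:

```latex
% Sequence-level importance ratio (GSPO), length-normalized
% over the response y_i of length |y_i|:
s_i(\theta) = \left( \frac{\pi_\theta(y_i \mid x)}
                          {\pi_{\theta_{\mathrm{old}}}(y_i \mid x)} \right)^{1/|y_i|}

% Objective: clip the sequence-level ratio against the
% group-normalized advantage \hat{A}_i, for a group of G responses:
J(\theta) = \mathbb{E}\!\left[ \frac{1}{G} \sum_{i=1}^{G}
  \min\!\Big( s_i(\theta)\,\hat{A}_i,\;
  \mathrm{clip}\big(s_i(\theta),\, 1-\varepsilon,\, 1+\varepsilon\big)\,\hat{A}_i \Big) \right]
```

The intuition for the MoE stability claim: per-token ratios (as in GRPO) accumulate variance when expert routing shifts between the old and new policy, while a single sequence-level ratio averages that noise out.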
good night!
gotta sleep early. tmr (oh i should say today) is qwen3-235b-a22b-thinking-2507 if everything goes well.
Super interesting. It’s a model that works in a memory layer, and it is vision-first. I guess it remembers yet forgets. It’s a system rather than a pure “model” model; indeed, it pushes the memory system/layer further. Can’t wait to have a close look! cc @yetone @bozhao
I’m Shawn, founder of Memories.ai, former researcher at Meta and CS PhD at University of Cambridge. Today we’re launching: we built the world’s first Large Visual Memory Model to give AI human-like visual memories. Why visual memory? AI to…
Pydantic is hiring a Rust engineer to join our database team. We are building a cutting edge Rust database for observability on top of DataFusion. This is a remote role with great benefits and an amazing team. Please reach out with any questions! pydantic.dev/jobs/rust
in the real world, as of now: pair coding rather than vibe coding, except for clearly scoped, self-contained, small changes.
for me it's still cursor. my workflow is more pair coding with cursor and pair/vibe design with v0 dot dev. in the study phase, i talked to gemini pro/o4 mini/claude about my upstream projects within github copilot, and ran deep research on chatgpt/gemini; then i started writing RFC…
Thanks! avante.nvim has introduced Beast Mode 3.1
Beast Mode 3.1 is here 🐻
Improvements...
* More agentic coaxing
* Memory
* Reduce file/folder re-reads
* Git from the agent
* Perf tweaks
Link in comments because thanks X algo 👇
omg that’s the DGL team. Looking forward to what Dr. Minjie will build after this journey.
🚨Breaking🚨 Minjie Wang, Chief Applied Scientist at Amazon Web Services AI Shanghai Lablet, posted on his WeChat Moments that they had just received notice of the official dissolution of the AWS AI Shanghai Lablet—AWS's last overseas research facility.
🤩
Run Qwen3-Coder with our Dynamic 2-bit GGUFs! ⭐ We shrank the 480B parameter model to just 182GB (down from 512GB). Also, run with 1M context length. Achieve >6 tokens/s on 182GB unified memory or 158GB RAM + 24GB VRAM. Qwen3-Coder-480B-A35B GGUFs: huggingface.co/unsloth/Qwen3-…
game changer
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…
Thanks @wey_gu 🙏 This is just a beginning! 🤗 More to come soon: huggingface.co/OpenMed
Great progress: 35s down to 2s. Actually, the first time I ran SGLang after a long stretch on vLLM, with the same cached Transformers model, I was really surprised by how fast it started serving.
vLLM is finally addressing a long-standing problem: startup times. 35s -> 2s for CUDA graph capture is a great reduction!
Because Qwen3’s new MoE models need a lot of RAM for offloaded deployment, I maxed out the memory on my power-hungry server. A 32 GiB stick has dropped to around ¥75 😅; when I first built the machine, it was about ¥300 from the vendor. 192 -> 256 GiB now.
I couldn’t stand C++ compile times, so I built a desktop a while back: 72 threads (HT), 192 GiB RAM (slots not even full yet). Felt like a great deal: second-hand “e-waste” Haswell CPU and RAM, a new motherboard, a bit over ¥5000 in total.
Now I am running @nowledgelabs at night and in my free time only (alpha launch soon!). Hopefully I will make it my daytime job soon!
Still more work to do! #buildinpublic
This work is fantastic, and it also shows a great pattern for distributing and collaborating on knowledge work in this kind of open-domain specialization. Looking forward to the merged single NER model!
🚀 Big news in healthcare AI! I'm thrilled to announce the launch of OpenMed on @huggingface, releasing 380+ state-of-the-art medical NER models for free under Apache 2.0. And this is just the beginning! 🧵
Nice! I remember antfu once said in person that he initially made Slidev deliberately less friendly to non-programmers; after all, there is a “dev” substring in the name. Later, its popularity exceeded expectations. Also see x.com/HiSuneBear/sta…
This morning I vibe-coded a slide-deck generator. Why has no one been using antfu’s Slidev to make slides? I’ll give it a try right now.