Zephyr
@zephyr_z9
DMs are open
My Third Post: Implications of the H20. It will profoundly change the dynamics of the US-China AI race, especially in the RL/inference age

I'm long tech cos. Was short retail till the Trump TACO
respect, whats your current bets?
Bytedance, Whale, and Zhipu are top contenders. Could be a government-backed lab as well
My Lord?
I made a lot of money by buying Nvidia in 2018 & Broadcom in 2022, and shorting Tesla in December 2023
share a piece of investing lore about yourself
In 2016, I shared an elevator with the CFO of Wirecard in NY after a group meeting MS hosted with a room full of rabid short sellers, following some high-profile fraud allegations. We were short the stock - in size - because we thought they were probably a fraud and that they would at…
share a piece of investing lore about yourself
AI is very disruptive for policy research. Personally, my remaining alpha is almost entirely my: 1. network, 2. non-public info, 3. IRL persuasion ability, 4. platform. Less so my reasoning, writing ability, or mastery of public info, which are available in abundance to all.
First it was Yookay, now Australia too. Is this a roundabout way of targeting American tech companies now that Trump will block direct fines?
Soon, you will need an ID to browse the internet, play video games, or watch movies, regardless of your country.
This could be very interesting
✨ Second day of WAIC: SmallThinker🔥 a family of on-device MoE language models that need no GPU, built from the ground up for local AI, by @sjtu1896 and Zenergize AI. huggingface.co/PowerInfer/Sma… huggingface.co/PowerInfer/Sma… ✨ 4B (0.6B active) / 21B (3B active) ✨ Blazing-fast CPU inference…
Why do FFNs use ReLU instead of more precise ones like Exp? "We propose the following hypothesis: A kernel with lower retrieval precision encourages a more polysemantic key–value memory: multiple unrelated facts can be stored under the same key space" Great and inspiring read!
TEMU HAS BEEN TOLD BY US COMPANIES AND SELLERS THAT IT CANNOT PROVIDE CHEAPER PRICES ON BRANDED PRODUCTS THAN THOSE OFFERED ON AMAZON - FT
These 4 will be the new stuff incorporated in V4/R2, along with more parameters & training data, and increased sparsity
ded rong. NSA & SPCT GRM are far more impressive compared to Prover V2. What they haven't released this year is a new optimizer (I don't think they will use AdamW, especially when Muon exists) and a better RL algo. GRPO has some issues and whale isn't as compute constrained anymore
In general Washington obsesses overmuch on DeepSeek but I find it interesting to observe that they have published far fewer novel papers this year than they did in 2024. Yes, r1 went viral in 2025. But r1-preview came out in fall 2024, and v3 came out at the tail end of the…
Phase 1 of Physics of Language Models code release ✅our Part 3.1 + 4.1 = all you need to pretrain strong 8B base model in 42k GPU-hours ✅Canon layers = strong, scalable gains ✅Real open-source (data/train/weights) ✅Apache 2.0 license (commercial ok!) 🔗github.com/facebookresear…
(1/8)🍎A Galileo moment for LLM design🍎 As Pisa Tower experiment sparked modern physics, our controlled synthetic pretraining playground reveals LLM architectures' true limits. A turning point that might divide LLM research into "before" and "after." physics.allen-zhu.com/part-4-archite…
Very cool
We're thrilled to release & open-source Hunyuan3D World Model 1.0! This model enables you to generate immersive, explorable, and interactive 3D worlds from just a sentence or an image. It's the industry's first open-source 3D world generation model, compatible with CG pipelines…
Huawei Ascend 910C chip analysis: TSMC N7 family, same chip as the 910B. Video on YouTube: youtu.be/0feji7gMrh8?si…