DeepSeek
@deepseek_ai
Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
To prevent any potential harm, we reiterate that @deepseek_ai is our sole official account on Twitter/X. Any accounts that represent us, use identical avatars, or use similar names are impersonations. Please stay vigilant to avoid being misled!
🚀 DeepSeek-R1-0528 is here!
🔹 Improved benchmark performance
🔹 Enhanced front-end capabilities
🔹 Reduced hallucinations
🔹 Supports JSON output & function calling
✅ Try it now: chat.deepseek.com
🔌 No change to API usage — docs here: api-docs.deepseek.com/guides/reasoni…
🔗…
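The new JSON output and function-calling modes follow the familiar OpenAI-style request shape. A minimal sketch of assembling such a payload (not sending it), where the `deepseek-reasoner` model identifier is an assumption and `get_weather` is a purely hypothetical example tool; the linked docs are authoritative for field names:

```python
import json

def build_request(question: str) -> dict:
    """Assemble a chat request that asks for JSON output and offers one tool."""
    return {
        "model": "deepseek-reasoner",                 # assumed model identifier
        "messages": [{"role": "user", "content": question}],
        "response_format": {"type": "json_object"},   # JSON output mode
        "tools": [{                                   # function-calling declaration
            "type": "function",
            "function": {
                "name": "get_weather",                # hypothetical example tool
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

payload = build_request("What's the weather in Hangzhou? Reply in JSON.")
print(json.dumps(payload["response_format"]))  # {"type": "json_object"}
```

Sending this to the chat-completions endpoint is unchanged from before; only the two new fields are additions.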

🚀 DeepSeek-V3-0324 is out now!
🔹 Major boost in reasoning performance
🔹 Stronger front-end development skills
🔹 Smarter tool-use capabilities
✅ For non-complex reasoning tasks, we recommend using V3 — just turn off “DeepThink”
🔌 API usage remains unchanged
📜 Models are…

🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview
Optimized throughput and latency via:
🔧 Cross-node EP-powered batch scaling
🔄 Computation-communication overlap
⚖️ Load balancing
Statistics of DeepSeek's Online Service:
⚡ 73.7k/14.8k…
🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
⚡ 3.66 TiB/min…
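As a quick sanity check on the 3FS figure, the aggregate read throughput spread evenly over the cluster works out to roughly 37.5 GiB/s of read bandwidth per node:

```python
# Back-of-envelope check on the 3FS number above: aggregate read throughput
# divided across the 180-node cluster, assuming an even spread.
AGG_READ_TIB_S = 6.6   # aggregate read throughput, TiB/s
NODES = 180            # cluster size

per_node_gib_s = AGG_READ_TIB_S * 1024 / NODES  # TiB -> GiB, then per node
print(round(per_node_gib_s, 1))  # 37.5 GiB/s of read bandwidth per node
```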
🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
🔗 github.com/deepseek-ai/Du…
✅ EPLB - an expert-parallel load balancer for V3/R1.
🔗…
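For intuition about the problem an expert-parallel load balancer solves, here is a toy greedy placement, a longest-processing-time heuristic that is NOT EPLB's actual algorithm: heaviest experts are placed first, each onto the currently lightest GPU.

```python
import heapq

def balance(expert_loads: list[float], num_gpus: int) -> list[list[int]]:
    """Greedily place experts on GPUs so per-GPU load is roughly even."""
    heap = [(0.0, g) for g in range(num_gpus)]  # (accumulated load, gpu id)
    heapq.heapify(heap)
    placement = [[] for _ in range(num_gpus)]
    # Heaviest experts first, each assigned to the least-loaded GPU so far.
    for expert in sorted(range(len(expert_loads)), key=lambda e: -expert_loads[e]):
        load, gpu = heapq.heappop(heap)
        placement[gpu].append(expert)
        heapq.heappush(heap, (load + expert_loads[expert], gpu))
    return placement

# Six experts with skewed loads end up perfectly balanced across two GPUs.
placement = balance([8.0, 7.0, 3.0, 2.0, 1.0, 1.0], num_gpus=2)
```

The real balancer additionally has to respect expert replication and the cluster topology; the sketch only captures the load-evening objective.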
🚨 Off-Peak Discounts Alert!
Starting today, enjoy off-peak discounts on the DeepSeek API Platform from 16:30–00:30 UTC daily:
🔹 DeepSeek-V3 at 50% off
🔹 DeepSeek-R1 at a massive 75% off
Make smarter use of your resources and save more during these high-value hours!
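Note that the discount window wraps past midnight UTC, so a naive `start <= t <= end` check fails. A sketch of the wraparound logic and the announced rates (whether the 00:30 boundary is inclusive is an assumption here):

```python
from datetime import time

WINDOW_START = time(16, 30)  # 16:30 UTC
WINDOW_END = time(0, 30)     # 00:30 UTC the next day

def in_off_peak(t: time) -> bool:
    # The window wraps midnight: inside if at/after the start OR before the end.
    return t >= WINDOW_START or t < WINDOW_END

def discounted_price(base: float, model: str, t: time) -> float:
    """Apply the announced off-peak discount: 50% off V3, 75% off R1."""
    if not in_off_peak(t):
        return base
    rate = {"deepseek-v3": 0.50, "deepseek-r1": 0.25}[model]  # fraction paid
    return base * rate

print(discounted_price(1.00, "deepseek-r1", time(17, 0)))  # 0.25
print(discounted_price(1.00, "deepseek-r1", time(12, 0)))  # 1.0
```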

🚀 Day 3 of #OpenSourceWeek: DeepGEMM
Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.
⚡ Up to 1350+ FP8 TFLOPS on Hopper GPUs
✅ No heavy dependency, as clean as a tutorial
✅ Fully Just-In-Time compiled…
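The core idea behind low-precision GEMM is quantize, multiply in the small format, then rescale. A toy integer stand-in for that flow (real FP8 kernels use hardware formats and finer-grained scales; this only illustrates why accuracy survives the round trip):

```python
def quantize(mat, levels=127):
    """Per-tensor scale quantization to small integers (an FP8 stand-in)."""
    amax = max(abs(x) for row in mat for x in row) or 1.0
    scale = amax / levels
    q = [[round(x / scale) for x in row] for row in mat]
    return q, scale

def gemm(a, b):
    """Plain triple-loop matrix multiply on the quantized operands."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

A, B = [[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]
qa, sa = quantize(A)
qb, sb = quantize(B)
# Dequantize the integer product with the two scales; result approximates A @ B.
C = [[x * sa * sb for x in row] for row in gemm(qa, qb)]
```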
🚀 Day 2 of #OpenSourceWeek: DeepEP
Excited to introduce DeepEP - the first open-source EP communication library for MoE model training and inference.
✅ Efficient and optimized all-to-all communication
✅ Both intranode and internode support with NVLink and RDMA
✅…
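Conceptually, an EP all-to-all dispatch groups each rank's tokens by the rank that hosts the chosen expert, then exchanges those buckets. The routing below is plain Python for illustration; DeepEP performs this exchange with fused NVLink/RDMA kernels.

```python
def dispatch(tokens: list[str], expert_ids: list[int],
             experts_per_rank: int, num_ranks: int) -> list[list[str]]:
    """Bucket tokens by destination rank (expert_id // experts_per_rank)."""
    send_buffers = [[] for _ in range(num_ranks)]
    for tok, eid in zip(tokens, expert_ids):
        send_buffers[eid // experts_per_rank].append(tok)
    return send_buffers

# 4 experts over 2 ranks (2 each): experts 0-1 live on rank 0, experts 2-3 on rank 1.
buffers = dispatch(["a", "b", "c", "d"], [0, 3, 2, 1],
                   experts_per_rank=2, num_ranks=2)
print(buffers)  # [['a', 'd'], ['b', 'c']]
```

The combine step after expert computation is the mirror image: results travel back along the same buckets.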
🚀 Day 1 of #OpenSourceWeek: FlashMLA
Honored to share FlashMLA - our efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production.
✅ BF16 support
✅ Paged KV cache (block size 64)
⚡ 3000 GB/s memory-bound & 580 TFLOPS…
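A sketch of the bookkeeping behind a paged KV cache with the block size mentioned above: a variable-length sequence occupies ceil(len / 64) fixed-size blocks, and a token position maps to a (block index, offset) pair via its block table.

```python
BLOCK_SIZE = 64  # KV-cache page size, as in the announcement

def num_blocks(seq_len: int) -> int:
    """Blocks needed for a sequence: ceiling division by the block size."""
    return (seq_len + BLOCK_SIZE - 1) // BLOCK_SIZE

def locate(pos: int) -> tuple[int, int]:
    """Map a token position to (logical block index, offset within block)."""
    return pos // BLOCK_SIZE, pos % BLOCK_SIZE

print(num_blocks(130))  # 3 blocks for a 130-token sequence
print(locate(129))      # (2, 1): third block, second slot
```

Paging this way lets sequences of very different lengths share one pool of fixed-size blocks with no per-sequence padding.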
🚀 Day 0: Warming up for #OpenSourceWeek! We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency. These humble building blocks in our online service have been documented,…
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference! Core components of NSA: • Dynamic hierarchical sparse strategy • Coarse-grained token compression • Fine-grained token selection 💡 With…
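A toy sketch of the two sparse paths named above, run on scalar "scores" rather than real attention: compress the past into per-block summaries (coarse path), and keep only the top-k individual tokens (fine path). Purely illustrative, not NSA itself.

```python
def compress_blocks(keys: list[float], block: int) -> list[float]:
    """Coarse-grained compression: mean-pool each block of keys."""
    return [sum(keys[i:i + block]) / len(keys[i:i + block])
            for i in range(0, len(keys), block)]

def select_topk(scores: list[float], k: int) -> list[int]:
    """Fine-grained selection: indices of the k highest-scoring tokens."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

keys = [1.0, 9.0, 2.0, 8.0, 3.0, 7.0, 4.0, 6.0]
coarse = compress_blocks(keys, block=4)  # one summary per 4-token block
fine = select_topk(keys, k=2)            # attend exactly to the top-2 tokens
print(coarse)  # [5.0, 5.0]
print(fine)    # [1, 3]
```

The hierarchical part of the real mechanism is using the cheap coarse summaries to decide where the expensive fine-grained selection should look.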
🎉 Excited to see everyone’s enthusiasm for deploying DeepSeek-R1! Here are our recommended settings for the best experience:
• No system prompt
• Temperature: 0.6
• Official prompts for search & file upload: bit.ly/4hyH8np
• Guidelines to mitigate model bypass…
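Applied to an OpenAI-compatible request, the first two recommendations look like this; the payload is only constructed here, not sent, and the `deepseek-reasoner` model name is an assumption:

```python
def r1_request(user_prompt: str) -> dict:
    """Build an R1 chat request with the recommended deployment settings."""
    return {
        "model": "deepseek-reasoner",                            # assumed identifier
        "messages": [{"role": "user", "content": user_prompt}],  # no system prompt
        "temperature": 0.6,                                      # recommended value
    }

req = r1_request("Prove that sqrt(2) is irrational.")
assert all(m["role"] != "system" for m in req["messages"])
```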