EXO Labs
@exolabs
AI on any device. 12 Days of EXO: https://blog.exolabs.net We're hiring: https://exolabs.net
The future of AI is open source and decentralized
"Exo's use of Llama 405B and consumer-grade devices to run inference at scale on the edge shows that the future of AI is open source and decentralized." - @mo_baioumy x.com/ac_crypto/stat…
‘The number of Macs that can train together coherently doubles every 2 months’; I'll call this 'Cheema's Law'. And it might sound like a joke, but it has been remarkable how much progress we've made on this problem in such a short time. When you're working in a space that is mostly…
We're doubling the number of Apple Silicon Macs that can train together coherently every 2 months. Our new KPOP optimizer was designed specifically for the hardware constraints of Apple Silicon and implemented using mlx.distributed.
New research from Exo done (in part) with MLX on Apple silicon: An algorithm for distributed training that leverages higher RAM capacity of Apple silicon relative to FLOPs and inter-machine bandwidth.
EXO 💛 MLX
Paper is out. Link: openreview.net/pdf?id=TJjP8d5…
KPOP is a new DL optimizer designed for large-scale distributed training on Apple Silicon. KPOP uses a lot more memory but is more efficient per FLOP than AdamW, so it's a better fit for hardware with a high memory:FLOPS ratio. Some hardware numbers: H100: 80GB, 1000TFLOPS…
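A rough sketch of the memory-per-FLOPS comparison the tweet is pointing at. The H100 figures (80GB, 1000TFLOPS) come from the tweet; the M2 Ultra figures used for contrast are assumptions for illustration, not numbers from the tweet.

```python
# Memory-per-FLOPS comparison: why KPOP targets Apple Silicon.
specs = {
    "H100":     {"mem_gb": 80,  "tflops": 1000},  # figures from the tweet
    "M2 Ultra": {"mem_gb": 192, "tflops": 27},    # assumed, for illustration
}

for name, s in specs.items():
    # GB of unified/HBM memory available per TFLOPS of compute
    ratio = s["mem_gb"] / s["tflops"]
    print(f"{name}: {ratio:.2f} GB per TFLOPS")
```

On these assumed numbers the Apple Silicon part has roughly two orders of magnitude more memory per unit of compute, which is the regime where a memory-hungry, FLOP-efficient optimizer like KPOP pays off.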
This past spring, I spent time with the @exolabs team to work on a new DL optimizer and wiring up clusters of Macs for distributed TRAINING on Apple Silicon. If you’re at ICML, be sure to come by the @ESFoMo workshop (posters 1-2:30pm) this Saturday. I’ll be there to share some…
I’m going to be in Vancouver next week for ICML! Would love to meet anyone involved with distributed training, infrastructure, inference engines, open source AI. I'll be presenting two papers:
- EXO Gym - an open source framework for simulating distributed training algorithms…
EXO isn't just for inference.
A new approach to efficient large-scale distributed training on Apple Silicon. Most AI research today is focused on traditional GPUs. These GPUs have a LOT of FLOPS but not much memory. They have a low memory:FLOPS ratio. Apple Silicon has a lot more memory available for the GPU…
Apple considered building its own AWS competitor, spending resources evaluating the complexity and cost of cloud infrastructure. Meanwhile, @exolabs has already proved the power of distributed in-house hardware, turning idle Macs into an active, decentralized data center.
Report: Apple looked into building its own AWS competitor 9to5mac.com/2025/07/03/rep… by @mvcmendes
Mac Minis are the ultimate store of value.
"Mac Minis for example are a very good fit" - @karpathy @karpathy shouted out my work on @exolabs in his keynote at @ycombinator AI SUS! Here's the breakdown: Right now most AI workloads run in the cloud where requests from different users are continuously batched together.…
Cost to run DeepSeek R1 (fp8) on Apple Silicon: $20,000 Cost to run DeepSeek R1 (fp8) on H100s: $300,000
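A quick sanity check on the quoted cost gap, using only the two figures in the tweet:

```python
# Hardware cost to run DeepSeek R1 (fp8), as quoted in the tweet.
apple_silicon_cost = 20_000   # USD
h100_cost = 300_000           # USD

# The quoted Apple Silicon setup is 15x cheaper than the H100 setup.
multiple = h100_cost / apple_silicon_cost
print(f"H100 setup costs {multiple:.0f}x more")
```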
.@karpathy shouted out my work on @exolabs at @ycombinator AI SUS! “we use LLMs similarly to mainframes in the ‘70s - compute is timeshared by having a slice in the batch dimension. models will compress over time, and with this we’ll be able to run more on-device”
How long before this gets into the training data?
Just got sent these news articles reporting that @exolabs is founded by @karpathy. wtf??