Alex Cheema - e/acc
@alexocheema
Building @exolabs | prev @UniOfOxford. We're hiring: http://exolabs.net
The future of AI is open source and decentralized.
"Exo's use of Llama 405B and consumer-grade devices to run inference at scale on the edge shows that the future of AI is open source and decentralized." - @mo_baioumy x.com/ac_crypto/stat…
Apple has more FLOPS than NVIDIA
If every post-2020 Apple device lit up its Neural Engine at once, humanity would have roughly 20 zetta integer ops per second of on-device AI compute, about five times the cumulative floating-point tensor capacity of all NVIDIA GPUs sold in the same period. In practice, Nvidia’s…
‘The number of Macs that can train together coherently doubles every 2 months’; I'll call this 'Cheema's Law'. And it might sound like a joke, but it has been remarkable how much progress we've made on this problem in such a short time. When you're working in a space that is mostly…
We're doubling the number of Apple Silicon Macs that can train together coherently every 2 months. Our new KPOP optimizer was designed specifically for the hardware constraints of Apple Silicon and implemented using mlx.distributed.
introducing KPOP - a novel optimiser designed to leverage the massive 512GB RAM on the latest-gen M3 Ultra Mac Studios. matches AdamW performance by using significantly larger batch sizes. all on consumer hardware. catch @MattBeton and @tychovdo presenting this at ICML
Paper is out. Link: openreview.net/pdf?id=TJjP8d5…
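For anyone curious what "implemented using mlx.distributed" looks like in practice, here is a minimal, hypothetical sketch of the underlying data-parallel pattern: each Mac computes gradients locally, then an all-reduce averages them before the optimizer step. This is not KPOP itself; the model, loss, batch shapes, and hyperparameters are placeholders for illustration only.

```python
# Minimal data-parallel sketch with mlx.distributed (NOT the KPOP algorithm).
# Launch one process per Mac (e.g. via mpirun); with a single process this
# degenerates to ordinary local training.
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim
from mlx.utils import tree_map

group = mx.distributed.init()      # distributed group across machines
world_size = group.size()

model = nn.Linear(1024, 1024)      # placeholder model
optimizer = optim.AdamW(learning_rate=1e-4)

def loss_fn(model, x, y):
    return nn.losses.mse_loss(model(x), y)

loss_and_grad = nn.value_and_grad(model, loss_fn)

def step(x, y):
    loss, grads = loss_and_grad(model, x, y)
    # Average gradients across all machines (all-reduce), then update.
    grads = tree_map(lambda g: mx.distributed.all_sum(g) / world_size, grads)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)
    return loss

# Toy usage with random data, just to show the call pattern.
x = mx.random.normal((8, 1024))
y = mx.random.normal((8, 1024))
print(step(x, y))
```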
New research from Exo done (in part) with MLX on Apple silicon: An algorithm for distributed training that leverages higher RAM capacity of Apple silicon relative to FLOPs and inter-machine bandwidth.
Paper is out. Link: openreview.net/pdf?id=TJjP8d5…
EXO 💛 MLX
KPOP is a new deep-learning optimizer designed for large-scale distributed training on Apple Silicon. KPOP uses a lot more memory but is more efficient per FLOP than AdamW, so it's a better fit for hardware with a high memory:FLOPS ratio. Some hardware numbers: H100: 80GB, 1000 TFLOPS…
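A rough back-of-the-envelope illustration of that memory:FLOPS argument. The H100 figures are from the tweet above and the 512GB RAM figure is from the KPOP announcement; the M3 Ultra TFLOPS number below is an assumed ballpark for illustration, not an official spec.

```python
# Memory-per-TFLOP comparison behind the KPOP design choice.
hardware = {
    "H100":     {"memory_gb": 80,  "tflops": 1000},  # from the tweet
    "M3 Ultra": {"memory_gb": 512, "tflops": 30},    # TFLOPS is an assumed ballpark
}

for name, spec in hardware.items():
    ratio = spec["memory_gb"] / spec["tflops"]       # GB available per TFLOP
    print(f"{name:9s}: {ratio:6.2f} GB per TFLOP")

# The Mac ends up with orders of magnitude more memory per unit of compute,
# which is why an optimizer that spends memory to save FLOPs (like KPOP)
# fits Apple Silicon better than AdamW does.
```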
I'm in Vancouver this week for ICML. Let's grab a coffee if you're interested in what we're doing @exolabs or want to chat about distributed training / inference or on-device AI. DMs open.
A new approach to efficient large-scale distributed training on Apple Silicon. Most AI research today is focused on traditional GPUs. These GPUs have a LOT of FLOPS but not much memory. They have a low memory:FLOPS ratio. Apple Silicon has a lot more memory available for the GPU…
💸OVERDRAFT - the first fiat DEX. Swap fiat <> crypto in seconds. No custody, no fees, no fund freeze. Beta now live @HyperLiquidX
EXO isn't just for inference.
I’m going to be in Vancouver next week for ICML! Would love to meet anyone involved with distributed training, infrastructure, inference engines, open source AI. I'll be presenting two papers:
- EXO Gym: an open source framework for simulating distributed training algorithms…
if they ever tell my story, let them say I walked with giants; men rise and fall like the winter wheat, but these names will never die.
HomeDAO Cohort 1 produced $6bn worth of companies. Apply to be part of HomeDAO Cohort 2 now. Deadline - 15th of July.
pump is one of the fastest-growing startups ever. 0 to $1B ARR in 9 months. 25% of revenue for $PUMP buybacks is insane; I'm predicting this ends up in the top 10.
The moment you’ve all been waiting for: $PUMP is launching through an Initial Coin Offering on Saturday, July 12th. Airdrop coming soon. Our plan is to kill Facebook, TikTok, and Twitch. On Solana. Learn more about $PUMP and how to get involved 👇
We’re already doing this with @exolabs. Last month was the first trial: we provided free M-chip public cloud access to developers at a hackathon. These were M3 Max/Ultra Mac Studios with up to 512GB unified memory. @awnihannun gave a talk at the hackathon on how to leverage MLX.
NEWS: Apple is considering turning its M chips into a public cloud for developers
He applied to @exolabs last year
PSA: there’s a guy named Soham Parekh (in India) who works at 3-4 startups at the same time. He’s been preying on YC companies and more. Beware. I fired this guy in his first week and told him to stop lying / scamming people. He hasn’t stopped a year later. No more excuses.