Coiled
@CoiledHQ
Scale Python with Dask
Easily configure shared memory size for CLI jobs with `--docker-shm-size`. Training PyTorch models on a GPU and need more memory? Ever run into "Error: No space left on device"? Customize Docker shared memory size with `--docker-shm-size`. docs.coiled.io/user_guide/cli…

🔨 Job setup option for Coiled Batch Use `--host-setup-script` to configure your VM before your batch job starts. Easily: ✅ Install dependencies ✅ Mount cloud storage ✅ Handle authentication or any other setup your jobs need. docs.coiled.io/user_guide/bat…
~5 years ago I worked at a startup where we had multiple engineers screwing around for months with Terraform, Kubernetes, EKS, etc. just to get the same capabilities I got after an hour of playing around w/ Coiled. Pretty cool.
Coiled 2024 in Review docs.coiled.io/blog/2024-eoy.… It’s the time when companies issue year-end summaries, acclaiming success (or not), and forecasting incredible growth for the next year (or not). I thought I’d do something similar for Coiled. It’s been quite a year for us ...
Calculating quantiles, a common application in #geospatial workloads, used to be slow due to GIL contention in NumPy. The new implementation in @dask_dev + @xarray_dev is up to a hundred times faster and scales independently of the number of threads 🥳. docs.coiled.io/blog/array-qua…
We're big fans of rich for a nice terminal experience, but have found sometimes folks log things even rich can't handle. In the latest coiled=1.67.0 release, coiled logs automatically falls back to non-rich printing in these situations. Release notes: docs.coiled.io/user_guide/cha…

New Post: Cloud Computing is Broken matthewrocklin.com/cloud-is-broke… Investor asks: "What's next for Data/Cloud Infrastructure?" My answer: "Boring stuff. People struggle with basics." Cloud feels like MP3 players before iPod. In theory everything is good. In practice adoption is low
We're now on Bluesky! Should be pretty easy to find us, since bluesky lets us use our coiled.io domain as our handle ☀️
Read about the latest improvement to GroupBy.map with Dask: xarray.dev/blog/dask-detr… Thanks to Patrick Hoefler of @CoiledHQ for the great work here!
New Post: SLURM-Style Job Arrays on the Cloud docs.coiled.io/blog/slurm-job… HPC Job scripts were the first form of parallelism I ever used as a graduate student. They're dead simple and accessible to almost anyone. We replicated the API with Coiled. It feels pretty slick to me 🙂
@CoiledHQ is amazing. If you want to have distributed compute and provisioned infrastructure from the code - its easy as that. Forget @ApacheSpark and @awscloud Sagemaker, EMR
New blog post: Scale AI-based Data Processing EASY The FineWeb-Edu dataset comes from processing 45TB (🤯) of FineWeb And it uses a Language Model to classify the educational level of the text 😭😭 Still, we reproduced it in a few lines of code ! The key ? HF + Dask 😎
Implemented @CoiledHQ into our product to offload data syncing from BigQuery to Neo4j 🤯 Works like butter 🧈 Now I don’t have to worry about scaling VMs dynamically to handle variable loads.
We're to build a 100-TB scale geospatial benchmark suite docs.coiled.io/blog/geospatia… We've seen an uptick in geospatial users and in challenges of the Xarray/Dask stack to scale beyond ~500-GiB. This post presents a call for benchmark workloads.
Arraylake and @CoiledHQ work great together! You can use Coiled to manage your cloud computing infrastructure with @dask_dev, and store your data as @zarr_dev in Arraylake. We just added new a documentation page about our integration with Coiled. docs.earthmover.io/integrations/c…
Run a Python script on a cloud GPU with one line of code. Training a @PyTorch model training takes ~10 minutes and cost ~$0.12 on the NVIDIA T4 GPU on AWS. Coiled handles provisioning hardware, setting up drivers, and installing CUDA-compiled PyTorch. docs.coiled.io/user_guide/gpu…

Dask DataFrame is now 20x faster. Some of most prominent changes include: - Apache Arrow support in @pandas_dev - Better shuffling algorithm for faster joins - Automatic query optimization Learn more: docs.coiled.io/blog/dask-data…
TPC-H Cloud Benchmarks: Spark, Dask, DuckDB, Polars Across scales: 10 GiB, 100 GiB, 1 TiB, 10 TiB Hardware: MBP and AWS It was a fun experiment. No project wins uniformly. DuckDB and Dask do pretty well. docs.coiled.io/blog/tpch.html
Recommendation of the day: `coiled notebook start` to run a remote Jupyter Lab from big machines in cloud but with file sync that feel "local". Demo from @CoiledHQ youtu.be/mibhDHYun0M #python #jupyter
On the 14th of May, @quantcotech Karlsruhe will host the next @PyDataSW Meetup. Florian Jetter will join us to talk about @dask_dev's impressive speed, and @pavelzw, Adrian, and @0xBE7A show how to manage hundreds of Python Sign up at meetup.com/pydata-suedwes…