Matthew Rocklin

Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask, a library for scalable computing. Matthew worked for Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled to improve Python's scalability with Dask for large organizations.

Matthew holds a bachelor's degree in physics and mathematics from UC Berkeley, and a PhD in computer science from the University of Chicago.

Sessions

11-03
13:30
40min
Spark, Dask, DuckDB, and Polars: Benchmarks
Matthew Rocklin

Spark, Dask, DuckDB, and Polars are popular dataframe tools used at large scale. We benchmark these across a variety of scales (10 GiB to 10 TiB) on both local and cloud architectures with the standard TPC-H benchmark. No project emerges unscathed.

Winter Garden (Room 5412)