polars_cloud.LazyFrameRemote.partition_by
LazyFrameRemote.partition_by(
    key: str | list[str],
    *,
    shuffle_compression: ShuffleCompression = 'auto',
    shuffle_format: ShuffleFormat = 'auto',
)
Partition this query by the given key.
This first partitions the data by the key and then runs this query per unique key. This will lead to N output results, where N is equal to the number of unique values in key.
This will run on multiple workers.
- Parameters:
- key
Key/keys to partition over.
- shuffle_compression : {‘auto’, ‘lz4’, ‘zstd’, ‘uncompressed’}
Compress files before shuffling them. Compression reduces disk and network IO, but disables memory mapping. Choose “zstd” for good compression performance. Choose “lz4” for fast compression/decompression. Choose “uncompressed” for memory mapped access at the expense of file size.
- shuffle_format : {‘auto’, ‘ipc’, ‘parquet’}
File format to use for shuffles.
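
The "partition, then run the query per unique key" behaviour described above can be illustrated with a small plain-Python sketch. This is not the Polars Cloud implementation and does not call its API: the helpers partition_rows and run_per_partition, the sample rows, and the total_sales query are all hypothetical names chosen for illustration, and everything runs in a single local process with no workers or shuffle files. It only shows why N unique key values produce N results, and how a single key and a list of keys might both be handled.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any, Callable

Row = dict[str, Any]


def partition_rows(rows: list[Row], key: str | list[str]) -> dict[tuple, list[Row]]:
    """Group rows by the value(s) of `key`: one partition per unique key."""
    # Hypothetical helper: normalise a single key name to a list of names.
    keys = [key] if isinstance(key, str) else list(key)
    partitions: dict[tuple, list[Row]] = defaultdict(list)
    for row in rows:
        partitions[tuple(row[k] for k in keys)].append(row)
    return dict(partitions)


def run_per_partition(
    rows: list[Row],
    key: str | list[str],
    query: Callable[[list[Row]], Any],
) -> dict[tuple, Any]:
    """Run `query` once per partition, giving one result per unique key."""
    return {k: query(part) for k, part in partition_rows(rows, key).items()}


def total_sales(part: list[Row]) -> int:
    """Example query: sum the `sales` field within one partition."""
    return sum(r["sales"] for r in part)


rows = [
    {"region": "eu", "sales": 10},
    {"region": "us", "sales": 7},
    {"region": "eu", "sales": 5},
]

# Two unique values of "region", so two results.
results = run_per_partition(rows, "region", total_sales)
print(results)  # {('eu',): 15, ('us',): 7}
```

In the remote setting the partitions are distributed over multiple workers rather than looped over locally, which is where the shuffle_compression and shuffle_format options come into play.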