ComputeContext

class polars_cloud.ComputeContext(
*,
cpus: int | None = None,
memory: int | None = None,
instance_type: str | None = None,
storage: int | None = None,
cluster_size: int | None = None,
requirements: str | Path | io.IOBase | bytes | None = None,
connection_mode: ConnectionMode = 'direct',
workspace: str | UUID | Workspace | None = None,
labels: list[str] | str | None = None,
log_level: LogLevel | None = None,
idle_timeout_mins: int | None = None,
insecure: bool = False,
)

Compute context in which queries are executed.

The compute context is an abstraction of the underlying hardware (either a single node or multiple nodes in case of a cluster).

Parameters:
cpus

The number of CPUs each instance in the compute context should have access to. This acts as a lower bound: Polars Cloud finds the smallest available machine that satisfies both the cpus and memory requirements.

memory

The amount of RAM (in GB) each instance in the compute context should have access to. Like cpus, this acts as a lower bound.

instance_type

The instance type to use.

storage

The amount of disk space (in GB) each instance in the compute context should have access to.

cluster_size

The number of machines to spin up in the cluster. Defaults to 1. Includes the optional big worker instance.

requirements

Path to a file or a file-like object containing dependencies to install in the compute context, in the requirements.txt format.

connection_mode

How the context will connect to the compute cluster. Defaults to direct.

- direct: connect directly to the compute cluster.
- proxy: send queries to the compute cluster via the control plane.

workspace

The workspace to run this context in. You may specify the name (str), the id (UUID) or the Workspace object. If you belong to multiple organizations that have a workspace with the same name, you must specify the organization explicitly.

labels

Labels of the workspace. They will be implicitly created.

log_level : {'info', 'debug', 'trace'}

Override the log level of the context for debug purposes.

idle_timeout_mins

How many minutes a cluster can be idle before it is automatically killed. The minimum is 10 minutes; the default is 1 hour.
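
Because requirements accepts a file-like object, a dependency list can be built in memory instead of being written to disk first. The following is a minimal sketch: `build_requirements` is a hypothetical helper, the package pins are placeholders, and the constructor call is left as a comment because it needs a configured workspace.

```python
import io


def build_requirements(packages: dict[str, str]) -> io.BytesIO:
    """Render {name: version_specifier} as requirements.txt content in memory.

    Hypothetical helper for illustration; not part of polars_cloud.
    """
    lines = [f"{name}{spec}" for name, spec in sorted(packages.items())]
    return io.BytesIO(("\n".join(lines) + "\n").encode("utf-8"))


# Example pins are placeholders chosen for this sketch.
reqs = build_requirements({"numpy": ">=1.26", "scikit-learn": "==1.5.0"})
requirements_text = reqs.getvalue().decode("utf-8")

# The buffer is an io.IOBase instance, one of the documented accepted types:
# ctx = pc.ComputeContext(workspace="workspace-name", requirements=reqs)
```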

Notes

If the cpus, memory, and instance_type parameters are not set, the parameters are resolved with the default context specs of the workspace.
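
One way to picture how the lower-bound semantics of cpus and memory combine with the workspace defaults is sketched below. The instance catalog, the `Instance` type, the `resolve_instance` function, and the tie-breaking order are all assumptions made for illustration; the actual selection logic used by Polars Cloud is not described on this page.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    cpus: int
    memory_gb: int


# Hypothetical catalog; real instance types depend on the cloud provider.
CATALOG = [
    Instance("small", 4, 16),
    Instance("medium", 16, 64),
    Instance("large", 32, 128),
    Instance("xlarge", 96, 384),
]


def resolve_instance(
    cpus: int | None,
    memory: int | None,
    instance_type: str | None,
    default: Instance,
) -> Instance:
    """Pick an instance following the documented rules, as this sketch reads them."""
    # Nothing specified: fall back to the workspace default spec.
    if cpus is None and memory is None and instance_type is None:
        return default
    # An explicit instance type is looked up by name (assumed behaviour).
    if instance_type is not None:
        by_name = {i.name: i for i in CATALOG}
        if instance_type not in by_name:
            raise ValueError(f"unknown instance type: {instance_type!r}")
        return by_name[instance_type]
    # cpus and memory are lower bounds; choose the smallest machine meeting both.
    candidates = [
        i for i in CATALOG
        if i.cpus >= (cpus or 0) and i.memory_gb >= (memory or 0)
    ]
    if not candidates:
        raise ValueError("no instance satisfies the requested cpus and memory")
    return min(candidates, key=lambda i: (i.cpus, i.memory_gb))


chosen = resolve_instance(cpus=24, memory=24, instance_type=None, default=CATALOG[0])
```

With this catalog, a request for 24 CPUs and 24 GB skips the 16-CPU machine and lands on the 32-CPU one, which is what "lower bound" means in practice: the machine received may be larger than what was asked for.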

Examples

>>> ctx = pc.ComputeContext(
...     workspace="workspace-name", cpus=24, memory=24, cluster_size=2, labels="docs"
... )
>>> ctx
ComputeContext(
    id=None,
    cpus=24,
    memory=24,
    instance_type=None,
    storage=None,
    cluster_size=2,
    mode="direct",
    workspace_name="workspace-name",
    labels=["docs"],
    log_level=LogLevelSchema.Info,
)
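
In the output above, the single string passed as labels appears as a one-element list. A sketch of that kind of normalization is shown below, using a hypothetical `normalize_labels` helper; the library's internal handling may differ.

```python
from __future__ import annotations


def normalize_labels(labels: list[str] | str | None) -> list[str] | None:
    """Return labels as a list, wrapping a lone string; None passes through."""
    if labels is None:
        return None
    if isinstance(labels, str):
        return [labels]
    # Copy so later mutation of the caller's list has no effect here.
    return list(labels)
```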

The ComputeContext can also be used as a decorator or context manager, which will set it as the default context within its scope, automatically starting and stopping the underlying compute:

@pc.ComputeContext(workspace="workspace-name", cpus=96, memory=256)
def run_queries():
    ...
    query1.remote().execute()
    query2.remote().execute()

# or

with pc.ComputeContext(workspace="workspace-name", cpus=96, memory=256):
    ...
    query1.remote().execute()
    query2.remote().execute()
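
Python's standard library supports this dual decorator and context-manager pattern through contextlib.ContextDecorator. The sketch below uses a stand-in `FakeContext` class to show the start-on-entry, stop-on-exit lifecycle. It is not the ComputeContext implementation, and the event log exists only to make the ordering visible.

```python
from contextlib import ContextDecorator


class FakeContext(ContextDecorator):
    """Stand-in that records lifecycle events instead of booting machines."""

    def __init__(self) -> None:
        self.events: list[str] = []

    def __enter__(self) -> "FakeContext":
        self.events.append("start")
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Runs even when the body raises, so the compute is always released.
        self.events.append("stop")
        return False  # do not swallow exceptions


ctx = FakeContext()


@ctx
def run_queries() -> None:
    ctx.events.append("query")


run_queries()  # decorator form

with ctx:  # context-manager form
    ctx.events.append("query")
```

Both forms produce the same start, query, stop sequence, which is why the two usages above are interchangeable.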

Methods:

get_status

Get the status of the compute context.

start

Start the compute context.

stop

Stop the compute context.

list

List all compute contexts in the workspace and the current status for each.

connect

Reconnect with an already running compute context by id.

select

Connect to existing compute context interactively.

Attributes:

cpus

The number of CPUs each instance has access to.

memory

The amount of RAM (in GB) each instance has access to.

instance_type

The instance type of the compute context.

storage

The amount of disk space (in GB) each instance has access to.

cluster_size

The number of compute nodes in the context.

connection_mode

How the context connects to the compute cluster.

labels

The labels of the compute context.

organization

The organization to run the compute context in.

workspace

The workspace to run the compute context in.

get_status() -> ComputeContextStatus

Get the status of the compute context.

Examples

>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.get_status()
UNINITIALIZED

start(*, wait: bool = False) -> None

Start the compute context.

This boots up the underlying node(s) of the compute context.

Parameters:
wait

Wait for the context to be ready before returning. If the ComputeContext is in direct connection mode, it will always wait until ready.

Examples

>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.start()
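
Blocking until a context is ready amounts to polling its status. A generic sketch of such a loop follows. `wait_until` is a hypothetical helper, and every status string other than UNINITIALIZED is an assumption for illustration rather than a polars_cloud name.

```python
import time
from typing import Callable


def wait_until(
    get_status: Callable[[], str],
    target: str,
    timeout_s: float = 60.0,
    interval_s: float = 0.01,
) -> str:
    """Poll get_status() until it returns target or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} s")
        time.sleep(interval_s)


# Simulated status sequence; "STARTING" and "RUNNING" are hypothetical values.
_statuses = iter(["UNINITIALIZED", "STARTING", "STARTING", "RUNNING"])
final = wait_until(lambda: next(_statuses), target="RUNNING")
```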

stop(*, wait: bool = False) -> None

Stop the compute context.

Parameters:
wait

If True, block the calling thread until the context is stopped.

Examples

>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.stop()

classmethod list(
workspace: Workspace | UUID | str,
) -> list[tuple[Self, ComputeContextStatus]]

List all compute contexts in the workspace and the current status for each.

Parameters:
workspace

The workspace whose compute contexts are listed, given by name, ID, or Workspace object.

Examples

>>> pc.ComputeContext.list(workspace="YourWorkspace")
[
    (ComputeContext(...), ComputeContextStatus),
    (ComputeContext(...), ComputeContextStatus),
    (ComputeContext(...), ComputeContextStatus),
]
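
Because list returns (context, status) pairs, the result can be filtered with an ordinary comprehension. The sketch below uses plain strings as stand-ins for both the ComputeContext objects and the ComputeContextStatus values; the status names are hypothetical.

```python
# Stand-in for the return value of ComputeContext.list(...): (context, status) pairs.
contexts = [
    ("ctx-a", "RUNNING"),
    ("ctx-b", "STOPPED"),
    ("ctx-c", "RUNNING"),
]

# Keep only the contexts whose status matches.
running = [ctx for ctx, status in contexts if status == "RUNNING"]

# Count contexts per status.
by_status: dict[str, int] = {}
for _, status in contexts:
    by_status[status] = by_status.get(status, 0) + 1
```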

classmethod connect(
compute_id: str | UUID,
workspace: str | UUID | Workspace | None = None,
) -> Self

Reconnect with an already running compute context by id.

Parameters:
compute_id

The unique identifier of the existing compute context.

workspace

The workspace in which the compute context lives.

Examples

>>> ctx = pc.ComputeContext.connect(
...     workspace="WorkspaceName",
...     compute_id="xxxxxxxx-1860-7521-829d-40444726cbca",
... )
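
Since compute_id accepts either a str or a UUID, it can help to validate the value before attempting to connect. The sketch below uses the standard uuid module. `parse_compute_id` is a hypothetical helper, and whether the library validates IDs this way is an assumption.

```python
from __future__ import annotations

import uuid


def parse_compute_id(compute_id: str | uuid.UUID) -> uuid.UUID:
    """Return compute_id as a UUID, raising ValueError for malformed strings."""
    if isinstance(compute_id, uuid.UUID):
        return compute_id
    return uuid.UUID(compute_id)


# A randomly generated ID stands in for a real compute context ID.
example_id = uuid.uuid4()
parsed = parse_compute_id(str(example_id))

# The masked ID in the example above is a placeholder, not valid hexadecimal.
try:
    parse_compute_id("xxxxxxxx-1860-7521-829d-40444726cbca")
    placeholder_is_valid = True
except ValueError:
    placeholder_is_valid = False
```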

classmethod select(workspace: str | UUID | Workspace | None = None) -> Self | None

Connect to existing compute context interactively.

Parameters:
workspace

The workspace in which the compute context lives.

Examples

>>> ctx = pc.ComputeContext.select(
...     workspace="WorkspaceName",
... )
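
select is typed as returning Self | None, so callers should handle the case where no context comes back. A sketch of that guard follows, with a hypothetical `select_or_create` helper and plain callables standing in for ComputeContext.select and the constructor. The conditions under which select returns None are not stated on this page.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def select_or_create(select: Callable[[], Optional[T]], create: Callable[[], T]) -> T:
    """Return the selected context, or build a new one if nothing was selected."""
    selected = select()
    return selected if selected is not None else create()


# Stand-ins: the first selector returns nothing, the second returns a context.
fallback = select_or_create(lambda: None, lambda: "new-context")
picked = select_or_create(lambda: "existing-context", lambda: "new-context")
```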
property cpus: int | None

The number of CPUs each instance has access to.

property memory: int | None

The amount of RAM (in GB) each instance has access to.

property instance_type: str | None

The instance type of the compute context.

property storage: int | None

The amount of disk space (in GB) each instance has access to.

property cluster_size: int

The number of compute nodes in the context.

property connection_mode: str

How the context connects to the compute cluster.

property labels: list[str] | None

The labels of the compute context.

property organization: Organization

The organization to run the compute context in.

property workspace: Workspace

The workspace to run the compute context in.