ComputeContext

Compute context in which queries are executed.

The compute context is an abstraction of the underlying hardware (either a single node or multiple nodes in case of a cluster).

Parameters:

name: The name of the registered ComputeContext to start, or connect to if already running.
cpus: The number of CPUs each instance in the compute context should have access to. This acts as a lower bound: Polars Cloud finds the smallest available machine that satisfies both the cpus and memory requirements.
memory: The amount of RAM (in GB) each instance in the compute context should have access to. This acts as a lower bound in the same way as cpus.
instance_type: The instance type to use.
storage: The amount of disk space (in GB) each instance in the compute context should have access to.
cluster_size: The number of machines to spin up in the cluster. Defaults to 1. Includes the optional big worker instance.
requirements: Path to a file or a file-like object [1] containing dependencies to install in the compute context, in the requirements.txt format.
connection_mode: How the context will connect to the compute cluster. Defaults to direct.
    - direct: connect directly to the compute cluster.
    - proxy: send queries to the compute cluster via the control plane.
workspace: The workspace to run this context in. You may specify the name (str), the id (UUID) or the Workspace object. If you’re in multiple organizations that have a workspace with the same name then you need to explicitly specify the organization.
labels: Labels of the workspace (will be implicitly created)
log_level: One of {'info', 'debug', 'trace'}. Overrides the log level of the context for debugging purposes.
idle_timeout_mins: How many minutes a cluster can be idle before it is automatically killed. The minimum is 10 minutes; the default is 1 hour.
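
The lower-bound behaviour of cpus and memory can be illustrated with a small sketch. The catalog, the InstanceSpec class and the smallest_fit function below are hypothetical and are not part of Polars Cloud; the ordering used for "smallest" (fewest CPUs, then least memory) is also an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    """Hypothetical catalog entry; not a Polars Cloud type."""

    name: str
    cpus: int
    memory_gb: int


# Made-up catalog, for illustration only.
CATALOG = (
    InstanceSpec("small", cpus=4, memory_gb=16),
    InstanceSpec("medium", cpus=16, memory_gb=64),
    InstanceSpec("large", cpus=32, memory_gb=128),
    InstanceSpec("xlarge", cpus=96, memory_gb=384),
)


def smallest_fit(cpus: int, memory_gb: int, catalog=CATALOG) -> InstanceSpec | None:
    """Return the smallest entry that meets both lower bounds, or None."""
    candidates = [
        spec for spec in catalog
        if spec.cpus >= cpus and spec.memory_gb >= memory_gb
    ]
    # Assumption: "smallest" means fewest CPUs, then least memory.
    return min(candidates, key=lambda s: (s.cpus, s.memory_gb), default=None)


# A request for 24 CPUs and 24 GB is rounded up to the next machine that fits.
print(smallest_fit(cpus=24, memory_gb=24))
```

In this sketch a request can end up on a machine that is larger than asked for in both dimensions, which is what "lower bound" means here.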

Notes

If the cpus, memory, and instance_type parameters are not set, the parameters are resolved with the default context specs of the workspace.
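
The fallback described in this note can be pictured as follows. This is only a sketch of the rule as stated: resolve_specs and the workspace_defaults dictionary are hypothetical names, and how Polars Cloud handles a partially specified context is not covered by the note, so the sketch passes such requests through unchanged.

```python
def resolve_specs(cpus=None, memory=None, instance_type=None, *, workspace_defaults):
    """Hypothetical helper: fall back to workspace defaults when nothing is set."""
    if cpus is None and memory is None and instance_type is None:
        # None of the three sizing parameters were given: use the defaults.
        return dict(workspace_defaults)
    # Otherwise keep exactly what the caller asked for.
    return {"cpus": cpus, "memory": memory, "instance_type": instance_type}


# Made-up default specs for a workspace.
defaults = {"cpus": 8, "memory": 32, "instance_type": None}

print(resolve_specs(workspace_defaults=defaults))
print(resolve_specs(cpus=24, memory=24, workspace_defaults=defaults))
```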

Footnotes

Examples

>>> ctx = pc.ComputeContext(
...     workspace="workspace-name", cpus=24, memory=24, cluster_size=2, labels="docs"
... )
>>> ctx
ComputeContext(
    id=None,
    cpus=24,
    memory=24,
    instance_type=None,
    storage=None,
    cluster_size=2,
    mode="direct",
    workspace_name="workspace-name",
    labels=["docs"],
    log_level=LogLevelSchema.Info,
)
>>> ctx.register("compute-name")
>>> pc.ComputeContext(workspace="workspace-name", name="compute-name")
ComputeContext(
    name="compute-name",
    workspace_name="workspace-name",
    id=None,
    cpus=24,
    memory=24,
    instance_type=None,
    storage=None,
    cluster_size=2,
    mode="direct",
    labels=["docs"],
    log_level=LogLevelSchema.Info,
)

The ComputeContext can also be used as a decorator or context manager, which will set it as the default context within its scope, automatically starting and stopping the underlying compute:

@pc.ComputeContext(workspace="workspace-name", cpus=96, memory=256)
def run_queries():
    ...
    query1.remote().execute()
    query2.remote().execute()

# or

with pc.ComputeContext(workspace="workspace-name", cpus=96, memory=256):
    ...
    query1.remote().execute()
    query2.remote().execute()
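
For readers unfamiliar with objects that work as both a decorator and a context manager, the pattern can be sketched with the standard library's contextlib.ContextDecorator. FakeContext below is a hypothetical stand-in that only records events; whether ComputeContext is built on ContextDecorator is an assumption, not something this page states.

```python
from contextlib import ContextDecorator


class FakeContext(ContextDecorator):
    """Hypothetical stand-in that records when it is entered and exited."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        # A real compute context would start the underlying compute here.
        self.events.append("start")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs even if the body raised, so the compute is always stopped.
        self.events.append("stop")
        return False  # do not swallow exceptions


ctx = FakeContext()


@ctx
def run_queries():
    ctx.events.append("query")
    return "done"


result = run_queries()  # decorator form: start, query, stop

with ctx:  # context-manager form: start, query, stop
    ctx.events.append("query")

print(result, ctx.events)
```

Both forms enter before the body runs and exit afterwards, which matches the start-and-stop scoping described above.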

Methods:

`get_status`	Get the status of the compute context.
`register`	Register the compute cluster specs under the given name.
`start`	Start the compute context.
`stop`	Stop the compute context.
`list`	List all compute contexts in the workspace and the current status for each.
`connect`	Reconnect with an already running compute context by id.
`select`	Connect to existing compute context interactively.

Attributes:

`cpus`	The number of CPUs each instance has access to.
`memory`	The amount of RAM (in GB) each instance has access to.
`instance_type`	The instance type of the compute context.
`storage`	The amount of disk space (in GB) each instance has access to.
`cluster_size`	The number of compute nodes in the context.
`connection_mode`	In what way to connect to the compute cluster.
`labels`	The labels of the compute context.
`organization`	The organization to run the compute context in.
`workspace`	The workspace to run the compute context in.

get_status() → ComputeContextStatus

Get the status of the compute context.

Examples

>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.get_status()
UNINITIALIZED

register(name: str) → None

Register the compute cluster specs under the given name.

This does not start the compute cluster; it only allows the cluster to be started under this name in the future.

Parameters:

name: Name to register the compute cluster under.

Examples

>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.register("my-cluster")
>>> ctx.start()
>>> ctx = pc.ComputeContext(workspace="workspace-name", name="my-cluster")
>>> ctx.start()

start(*, wait: bool = False) → None

Start the compute context.

This boots up the underlying node(s) of the compute context.

Parameters:

wait: Wait for the compute context to be ready before returning. If the ComputeContext is in direct connection mode, it will always wait until ready.

Examples

>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.start()

stop(*, wait: bool = False) → None

Stop the compute context.

Parameters:

wait: If True, block the calling thread until the context is stopped.

Examples

>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.stop()
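
When start and stop are called manually, it can help to pair them so the context is stopped even if the work in between fails. The sketch below relies only on the start(*, wait) and stop(*, wait) signatures documented here; run_on and RecordingContext are hypothetical names, and RecordingContext is a stand-in used so the example runs without a Polars Cloud account.

```python
def run_on(ctx, work):
    """Hypothetical helper: start ctx, run work(), and always stop ctx."""
    ctx.start(wait=True)
    try:
        return work()
    finally:
        # Executed on success and on error alike.
        ctx.stop(wait=True)


class RecordingContext:
    """Stand-in exposing the same start/stop signatures; records calls."""

    def __init__(self):
        self.calls = []

    def start(self, *, wait=False):
        self.calls.append(("start", wait))

    def stop(self, *, wait=False):
        self.calls.append(("stop", wait))


demo = RecordingContext()
answer = run_on(demo, lambda: 42)
print(answer, demo.calls)
```

The decorator and context-manager forms of ComputeContext cover the same need; this explicit form is only relevant when the context object is managed by hand.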

classmethod list(workspace: Workspace | UUID | str) → list[tuple[Self, ComputeContextStatus]]

List all compute contexts in the workspace and the current status for each.

Parameters:

workspace: The name, ID, or Workspace object of the workspace the compute contexts live in.

Examples

>>> pc.ComputeContext.list(workspace="YourWorkspace")
[(ComputeContext(...), ComputeContextStatus),
 (ComputeContext(...), ComputeContextStatus),
 (ComputeContext(...), ComputeContextStatus)]
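
Because list returns (context, status) pairs, the result is easy to post-process. The sketch below groups contexts by status; group_by_status is a hypothetical helper, and the sample listing uses plain strings in place of real ComputeContext and ComputeContextStatus objects. The status names other than UNINITIALIZED are made up for the example.

```python
from collections import defaultdict


def group_by_status(listing):
    """Hypothetical helper: map each status to the contexts that have it."""
    groups = defaultdict(list)
    for ctx, status in listing:
        groups[status].append(ctx)
    return dict(groups)


# Stand-in data shaped like the documented return value: list[tuple[ctx, status]].
sample_listing = [
    ("ctx-a", "UNINITIALIZED"),
    ("ctx-b", "EXAMPLE_STATUS"),  # made-up status name
    ("ctx-c", "UNINITIALIZED"),
]

print(group_by_status(sample_listing))
```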

classmethod connect(compute_id: str | UUID, workspace: str | UUID | Workspace | None = None) → Self

Reconnect with an already running compute context by id.

Parameters:

compute_id: The unique identifier of the existing compute context.
workspace: The workspace in which the compute context lives.

Examples

>>> ctx = pc.ComputeContext.connect(
...     workspace="WorkspaceName",
...     compute_id="xxxxxxxx-1860-7521-829d-40444726cbca",
... )

classmethod select(workspace: str | UUID | Workspace | None = None) → Self | None

Connect to an existing compute context interactively.

Parameters:

workspace: The workspace in which the compute context lives.

Examples

>>> ctx = pc.ComputeContext.select(
...     workspace="WorkspaceName",
... )

property cpus: int | None: The number of CPUs each instance has access to.

property memory: int | None: The amount of RAM (in GB) each instance has access to.

property instance_type: str | None: The instance type of the compute context.

property storage: int | None: The amount of disk space (in GB) each instance has access to.

property cluster_size: int: The number of compute nodes in the context.

property connection_mode: str: In what way to connect to the compute cluster.

property labels: list[str] | None: The labels of the compute context.

property organization: Organization: The organization to run the compute context in.

property workspace: Workspace: The workspace to run the compute context in.