ComputeContext
- class polars_cloud.ComputeContext(
- *,
- name: str | None = None,
- cpus: int | None = None,
- memory: int | None = None,
- instance_type: str | None = None,
- storage: int | None = None,
- cluster_size: int | None = None,
- requirements: str | Path | io.IOBase | bytes | None = None,
- connection_mode: ConnectionMode | None = None,
- workspace: str | UUID | Workspace | None = None,
- labels: list[str] | str | None = None,
- log_level: LogLevel | None = None,
- idle_timeout_mins: int | None = None,
- insecure: bool = False,
- )
Compute context in which queries are executed.
The compute context is an abstraction of the underlying hardware (either a single node or multiple nodes in case of a cluster).
- Parameters:
- name
The name of the registered ComputeContext to start, or connect to if already running.
- cpus
The number of CPUs each instance in the compute context should have access to. This acts as a lower bound: Polars Cloud finds the smallest available machine that satisfies both the cpus and memory requirements.
- memory
The amount of RAM (in GB) each instance in the compute context should have access to. This acts as a lower bound; see cpus.
- instance_type
The instance type to use.
- storage
The amount of disk space (in GB) each instance in the compute context should have access to.
- cluster_size
The number of machines to spin up in the cluster. Defaults to 1. Includes the optional big worker instance.
- requirements
Path to a file or a file-like object [1] containing dependencies to install in the compute context, in the requirements.txt format.
- connection_mode
How the context will connect to the compute cluster. Defaults to direct.
  - direct: connect directly to the compute cluster.
  - proxy: send queries to the compute cluster via the control plane.
- workspace
The workspace to run this context in. You may specify the name (str), the id (UUID), or the Workspace object. If you belong to multiple organizations that each have a workspace with the same name, you must specify the organization explicitly.
- labels
Labels of the workspace (created implicitly if they do not exist).
- log_level : {'info', 'debug', 'trace'}
Override the log level of the context for debug purposes.
- idle_timeout_mins
How many minutes a cluster can be idle before it is automatically killed. The minimum is 10 minutes; the default is 1 hour.
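The lower-bound behaviour described for cpus and memory can be pictured with a small, self-contained sketch. Everything in it is an assumption made for illustration: the Machine type, the catalog entries, and the rule that "smallest" means fewest CPUs and then least memory are hypothetical and are not taken from Polars Cloud.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical catalog entry; not a Polars Cloud type."""

    name: str
    cpus: int
    memory_gb: int


# Made-up catalog, for illustration only.
CATALOG = [
    Machine("small", cpus=8, memory_gb=16),
    Machine("medium", cpus=16, memory_gb=64),
    Machine("large", cpus=32, memory_gb=128),
    Machine("xlarge", cpus=96, memory_gb=384),
]


def smallest_fit(cpus: int, memory_gb: int, catalog=CATALOG) -> Machine:
    """Return the smallest machine that meets both lower bounds."""
    candidates = [
        m for m in catalog if m.cpus >= cpus and m.memory_gb >= memory_gb
    ]
    if not candidates:
        raise ValueError("no machine satisfies the requested cpus and memory")
    # Assumption: "smallest" means fewest CPUs, ties broken by least memory.
    return min(candidates, key=lambda m: (m.cpus, m.memory_gb))


chosen = smallest_fit(cpus=24, memory_gb=24)
print(chosen.name, chosen.cpus, chosen.memory_gb)
```

In this made-up catalog, a request for 24 CPUs and 24 GB resolves to a machine with more of both, which is why the requested values are best read as minimums rather than exact sizes.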
Notes
Note
If the cpus, memory, and instance_type parameters are not set, the parameters are resolved with the default context specs of the workspace.
Footnotes
Examples
>>> ctx = pc.ComputeContext(
...     workspace="workspace-name", cpus=24, memory=24, cluster_size=2, labels="docs"
... )
>>> ctx
ComputeContext(
    id=None,
    cpus=24,
    memory=24,
    instance_type=None,
    storage=None,
    cluster_size=2,
    mode="direct",
    workspace_name="workspace-name",
    labels=["docs"],
    log_level=LogLevelSchema.Info,
)
>>> ctx.register("compute-name")
>>> pc.ComputeContext(workspace="workspace-name", name="compute-name")
ComputeContext(
    name="compute-name",
    workspace_name="workspace-name",
    id=None,
    cpus=24,
    memory=24,
    instance_type=None,
    storage=None,
    cluster_size=2,
    mode="direct",
    labels=["docs"],
    log_level=LogLevelSchema.Info,
)
The ComputeContext can also be used as a decorator or context manager, which will set it as the default context within its scope, automatically starting and stopping the underlying compute:
@pc.ComputeContext(workspace="workspace-name", cpus=96, memory=256)
def run_queries():
    ...
    query1.remote().execute()
    query2.remote().execute()

# or

with pc.ComputeContext(workspace="workspace-name", cpus=96, memory=256):
    ...
    query1.remote().execute()
    query2.remote().execute()
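The scoping behaviour described above (start on entry, act as the default inside the scope, stop on exit) follows a common Python pattern. The sketch below shows that pattern with contextlib.ContextDecorator from the standard library. ScopedContext, current_default, and the module-level stack are hypothetical stand-ins; nothing here reflects how polars_cloud implements it.

```python
from contextlib import ContextDecorator

# Hypothetical stand-in for "the default context within its scope".
_default_stack = []


def current_default():
    """Return the innermost active context, or None outside any scope."""
    return _default_stack[-1] if _default_stack else None


class ScopedContext(ContextDecorator):
    """Hypothetical stand-in; not polars_cloud.ComputeContext."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.events = []  # records lifecycle calls so they can be inspected

    def start(self) -> None:
        self.events.append("start")

    def stop(self) -> None:
        self.events.append("stop")

    def __enter__(self):
        self.start()
        _default_stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        _default_stack.pop()
        self.stop()
        return False  # never swallow exceptions raised inside the scope


ctx = ScopedContext("demo")


@ctx
def run_queries():
    # Inside the scope, the decorated context is the default.
    return current_default().name


result = run_queries()

with ScopedContext("other") as other:
    inside_is_default = current_default() is other
```

Because ContextDecorator wraps the function call in a with block, the decorator form and the with form share the same enter and exit logic, so the two spellings cannot drift apart.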
Methods:
- get_status: Get the status of the compute context.
- register: Register the compute cluster specs under the given name.
- start: Start the compute context.
- stop: Stop the compute context.
- list: List all compute contexts in the workspace and the current status for each.
- connect: Reconnect with an already running compute context by id.
- select: Connect to existing compute context interactively.
Attributes:
- cpus: The number of CPUs each instance has access to.
- memory: The amount of RAM (in GB) each instance has access to.
- instance_type: The instance type of the compute context.
- storage: The amount of disk space (in GB) each instance has access to.
- cluster_size: The number of compute nodes in the context.
- connection_mode: In what way to connect to the compute cluster.
- labels: The labels of the compute context.
- organization: The organization to run the compute context in.
- workspace: The workspace to run the compute context in.
- get_status() -> ComputeContextStatus
Get the status of the compute context.
Examples
>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.get_status()
UNINITIALIZED
- register(name: str) -> None
Register the compute cluster specs under the given name.
This does not start the compute cluster; it only allows the cluster to be started under this name in the future.
- Parameters:
- name
Name to register the compute cluster under.
Examples
>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.register("my-cluster")
>>> ctx.start()
>>> ctx = pc.ComputeContext(workspace="workspace-name", name="my-cluster")
>>> ctx.start()
- start(*, wait: bool = False) -> None
Start the compute context.
This boots up the underlying node(s) of the compute context.
- Parameters:
- wait
Wait for the compute context to be ready before returning. If the ComputeContext is in direct connection mode, it will always wait until ready.
Examples
>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.start()
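When start() is called without waiting, a caller may want to poll for readiness itself. The following is a minimal sketch of such a polling loop, kept independent of polars_cloud so it runs anywhere. The wait_for helper is hypothetical, and the status strings "STARTING" and "READY" are placeholders: this page only shows UNINITIALIZED, so the real ComputeContextStatus members should be looked up before adapting it.

```python
import time
from typing import Callable


def wait_for(
    get_status: Callable[[], object],
    is_ready: Callable[[object], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
) -> object:
    """Poll get_status() until is_ready(status) is true or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if is_ready(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        time.sleep(interval_s)


# Fake status source standing in for ctx.get_status (placeholder values).
fake_statuses = iter(["STARTING", "STARTING", "READY"])

final = wait_for(
    get_status=lambda: next(fake_statuses),
    is_ready=lambda status: status == "READY",
    timeout_s=1.0,
    interval_s=0.0,
)
print(final)
```

With a real context, ctx.get_status could be passed as get_status, with a readiness predicate written against the actual status enum. Passing wait=True to start() is the documented way to block, so a loop like this is only of interest when other work should happen between polls.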
- stop(*, wait: bool = False) -> None
Stop the compute context.
- Parameters:
- wait
If True, block the calling thread until the context is stopped.
Examples
>>> ctx = pc.ComputeContext(workspace="workspace-name", cpus=24, memory=24)
>>> ctx.stop()
- classmethod list(
- workspace: Workspace | UUID | str,
- )
List all compute contexts in the workspace and the current status for each.
- Parameters:
- workspace
Name or ID of the workspace the compute context lives in.
Examples
>>> pc.ComputeContext.list(workspace="YourWorkspace")
[(ComputeContext(...), ComputeContextStatus)],
[(ComputeContext(...), ComputeContextStatus)],
[(ComputeContext(...), ComputeContextStatus)],
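The example output suggests that each entry pairs a compute context with its status. Assuming that shape holds, the result can be grouped by status with plain Python. In the sketch below the pairs are placeholder strings rather than real ComputeContext and ComputeContextStatus objects, the status names are invented, and group_by_status is a hypothetical helper.

```python
from collections import defaultdict


def group_by_status(pairs):
    """Group the first element of each (context, status) pair by its status."""
    grouped = defaultdict(list)
    for context, status in pairs:
        # str() keeps the key printable whether status is an enum or a string.
        grouped[str(status)].append(context)
    return dict(grouped)


# Placeholder data; a real call would pass the result of ComputeContext.list().
pairs = [
    ("ctx-a", "RUNNING"),
    ("ctx-b", "STOPPED"),
    ("ctx-c", "RUNNING"),
]

by_status = group_by_status(pairs)
print(by_status)
```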
- classmethod connect( ) -> Self
Reconnect with an already running compute context by id.
- Parameters:
- workspace
The workspace in which the compute context lives.
- compute_id
The unique identifier of the existing compute context.
Examples
>>> ctx = pc.ComputeContext.connect(
...     workspace="WorkspaceName",
...     compute_id="xxxxxxxx-1860-7521-829d-40444726cbca",
... )
- classmethod select(workspace: str | UUID | Workspace | None = None) -> Self | None
Connect to existing compute context interactively.
- Parameters:
- workspace
The workspace in which the compute context lives.
Examples
>>> ctx = pc.ComputeContext.select(
...     workspace="WorkspaceName",
... )
- property cluster_size: int
The number of compute nodes in the context.
- property connection_mode: str
In what way to connect to the compute cluster.
- property organization: Organization
The organization to run the compute context in.
- property workspace: Workspace
The workspace to run the compute context in.