distributed 2021.10.0


The SpecCluster class expects a full specification of the Scheduler and Workers to use. It removes any handling of user inputs (like threads vs processes, number of cores, and so on) and any handling of cluster resource managers (like pods, jobs, and so on). Instead, it expects this information to be passed in scheduler and worker specifications. This class does handle all of the logic around asynchronously and cleanly setting things up and tearing them down at the right times. Hopefully it can form a base for other more user-centric classes.

Parameters

workers: dict :

A dictionary mapping names to worker classes and their specifications. See example below.

scheduler: dict, optional :

A similar mapping for a scheduler

worker: dict :

A specification of a single worker. This is used for any new workers that are created.

asynchronous: bool :

If this is intended to be used directly within an event loop with async/await

silence_logs: bool :

Whether or not we should silence logging when setting up the cluster.

name: str, optional :

A name to use when printing out the cluster, defaults to type name

Cluster that requires a full specification of workers

Examples

To create a SpecCluster you specify how to set up a Scheduler and Workers

This example is valid syntax, but we were not able to check execution
>>> from dask.distributed import Scheduler, Worker, Nanny
>>> from distributed import SpecCluster
>>> scheduler = {'cls': Scheduler, 'options': {"dashboard_address": ':8787'}}
>>> workers = {
...     'my-worker': {"cls": Worker, "options": {"nthreads": 1}},
...     'my-nanny': {"cls": Nanny, "options": {"nthreads": 2}},
... }
>>> cluster = SpecCluster(scheduler=scheduler, workers=workers)

The worker spec is stored as the .worker_spec attribute.

This example is valid syntax, but we were not able to check execution
>>> cluster.worker_spec
{
   'my-worker': {"cls": Worker, "options": {"nthreads": 1}},
   'my-nanny': {"cls": Nanny, "options": {"nthreads": 2}},
}

While the instantiation of this spec is stored in the .workers attribute.

This example is valid syntax, but we were not able to check execution
>>> cluster.workers
{
    'my-worker': <Worker ...>
    'my-nanny': <Nanny ...>
}

Should the spec change, we can await the cluster or call the ._correct_state method to align the actual state to the specified state.
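To make that reconciliation concrete, the helper below is a hypothetical sketch of the bookkeeping involved, not the actual ._correct_state implementation: comparing the desired spec against the instantiated workers yields the set of workers to create and the set to close.

```python
def diff_spec(worker_spec, workers):
    """Return (names to create, names to close) given a desired spec
    and the currently instantiated workers, both keyed by name."""
    # A worker present in the spec but not yet running must be created;
    # one running but no longer in the spec must be closed.
    to_create = set(worker_spec) - set(workers)
    to_close = set(workers) - set(worker_spec)
    return to_create, to_close
```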

We can also .scale(...) the cluster, which adds new workers of a given form.

This example is valid syntax, but we were not able to check execution
>>> worker = {'cls': Worker, 'options': {}}
>>> cluster = SpecCluster(scheduler=scheduler, worker=worker)
>>> cluster.worker_spec
{}

This example is valid syntax, but we were not able to check execution
>>> cluster.scale(3)
>>> cluster.worker_spec
{
    0: {'cls': Worker, 'options': {}},
    1: {'cls': Worker, 'options': {}},
    2: {'cls': Worker, 'options': {}},
}

Note that above we are using the standard Worker and Nanny classes, however in practice other classes could be used that handle resource management, like KubernetesPod or SLURMJob. The spec does not need to conform to the expectations of the standard Dask Worker class. It just needs to be called with the provided options, support __await__ and close methods, and provide a worker_address property.
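As an illustration of that minimal contract, the following hypothetical class (not part of dask) satisfies it without depending on dask at all; a real deployment class would launch a pod or batch job where this one merely records an address:

```python
import asyncio

class DummyWorker:
    # Hypothetical stand-in for a deployment-specific worker class.
    # SpecCluster only needs the object to accept the provided options,
    # support __await__ (start) and close, and expose worker_address.
    def __init__(self, **options):
        self.options = options
        self._address = None

    def __await__(self):
        async def _start():
            # A real class would launch a pod/job here and wait for it.
            self._address = "tcp://127.0.0.1:0"
            return self
        return _start().__await__()

    async def close(self):
        self._address = None

    @property
    def worker_address(self):
        return self._address
```

Such a class could then appear in a spec entry as {'cls': DummyWorker, 'options': {...}}.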

Also note that uniformity of the specification is not required. Other APIs can be added externally (in subclasses) that add workers of different specifications into the same dictionary.

If a single entry in the spec will generate multiple dask workers, then please provide a "group" element in the spec that includes the suffixes that will be added to each name (this should be handled by your worker class).

This example is valid syntax, but we were not able to check execution
>>> cluster.worker_spec
{
    0: {"cls": MultiWorker, "options": {"processes": 3}, "group": ["-0", "-1", "-2"]},
    1: {"cls": MultiWorker, "options": {"processes": 2}, "group": ["-0", "-1"]},
}

These suffixes should correspond to the names used by the workers when they deploy.

This example is valid syntax, but we were not able to check execution
>>> [ws.name for ws in cluster.scheduler.workers.values()]
["0-0", "0-1", "0-2", "1-0", "1-1"]
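The naming rule can be sketched as a hypothetical helper (illustrative only, not a dask API) that concatenates each spec key with its group suffixes to produce the names the scheduler ultimately sees:

```python
def expand_group_names(worker_spec):
    # Illustrative: combine each spec key with its "group" suffixes to
    # form the worker names seen by the scheduler; entries without a
    # "group" contribute their key unchanged.
    names = []
    for name, spec in worker_spec.items():
        for suffix in spec.get("group", [""]):
            names.append(f"{name}{suffix}")
    return names
```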

Back References

The following pages refer to this document either explicitly or contain code examples using this.

distributed.deploy.spec.SpecCluster



File: /distributed/deploy/spec.py#119