distributed 2021.10.0

The scheduler tracks the current state of workers, data, and computations. It listens for events and responds by controlling workers appropriately, continuously trying to use the workers to execute an ever-growing Dask graph.

All events are handled quickly, in linear time with respect to their input (which is often of constant size) and generally within a millisecond. To accomplish this the scheduler tracks a lot of state. Every operation maintains the consistency of this state.
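The event-handling pattern described above can be illustrated with a toy sketch. This is not the implementation in distributed; the class `ToyScheduler`, its event names, and its state variables are all hypothetical and chosen only for illustration. It assumes each event maps to one handler through a dictionary lookup, and that state is re-validated after every event.

```python
from collections import deque


class ToyScheduler:
    """Toy event-driven scheduler (hypothetical, not distributed.Scheduler)."""

    def __init__(self):
        self.idle_workers = deque()  # workers with nothing to run
        self.ready_tasks = deque()   # tasks waiting for a worker
        self.running = {}            # task -> worker executing it
        self.results = {}            # task -> worker that produced its result
        # Dispatch table: finding the handler for an event is a dict lookup.
        self.handlers = {
            "add-worker": self.add_worker,
            "submit-task": self.submit_task,
            "task-finished": self.task_finished,
        }

    def handle(self, event, **payload):
        """Process one event, then react and verify the state."""
        self.handlers[event](**payload)
        self._assign()
        self._check()

    def add_worker(self, worker):
        self.idle_workers.append(worker)

    def submit_task(self, task):
        self.ready_tasks.append(task)

    def task_finished(self, task):
        worker = self.running.pop(task)
        self.results[task] = worker
        self.idle_workers.append(worker)

    def _assign(self):
        # Pair ready tasks with idle workers; the cost is proportional
        # to the number of pairs made, not to the total state size.
        while self.ready_tasks and self.idle_workers:
            task = self.ready_tasks.popleft()
            self.running[task] = self.idle_workers.popleft()

    def _check(self):
        # Invariant: a worker is either idle or running exactly one task.
        # This full scan is for illustration only; it is not constant time.
        busy = list(self.running.values())
        assert len(busy) == len(set(busy))
        assert not set(busy) & set(self.idle_workers)


sched = ToyScheduler()
sched.handle("submit-task", task="x")
sched.handle("submit-task", task="y")
sched.handle("add-worker", worker="w1")  # "x" starts on w1
sched.handle("task-finished", task="x")  # w1 is freed and picks up "y"
```

Keeping several indexed views of the same state (idle workers, ready tasks, running tasks) is what lets each handler do a small, bounded amount of work, at the cost of having to update every view consistently on each event.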

The scheduler communicates with the outside world through Comm objects. It maintains a consistent and valid view of the world even when listening to several clients at once.

A Scheduler is typically started either with the dask-scheduler executable:

$ dask-scheduler
Scheduler started at 127.0.0.1:8786

Or within a LocalCluster, which a Client starts when given no connection information:

>>> c = Client()  # doctest: +SKIP
>>> c.cluster.scheduler  # doctest: +SKIP
Scheduler(...)

Users typically do not interact with the scheduler directly but rather with the Client object.

State

The scheduler contains the following state variables. Each variable is listed along with what it stores and a brief description.

Dynamic distributed task scheduler

Examples

See :

Back References

The following pages refer to this document, either explicitly or through code examples that use it.

distributed.deploy.spec.SpecCluster
distributed.client.Client


File: /distributed/scheduler.py#3501
type: <class 'type'>
Commit: