distributed 2021.10.0

persist(self, collections, optimize_graph=True, workers=None, allow_other_workers=None, resources=None, retries=None, priority=0, fifo_timeout='60s', actors=None, **kwargs)

Starts computation of the collection on the cluster in the background. Returns a new dask collection that is semantically identical to the original, but is now backed by futures that are currently executing.

Parameters

collections : sequence or single dask object

Collections such as dask.array, dask.dataframe, or dask.delayed objects

optimize_graph : bool

Whether or not to optimize the underlying graphs

workers : string or iterable of strings

A set of worker hostnames on which computations may be performed. Leave empty to default to all workers (common case)

allow_other_workers : bool (defaults to False)

Used with `workers`. Indicates whether or not the computations may be performed on workers that are not in the `workers` set(s).
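
A minimal sketch of how `workers` and `allow_other_workers` interact, assuming an in-process `LocalCluster` with two workers. The function names `square` and `pinned_persist_demo` are hypothetical, and the example passes worker addresses taken from `Client.scheduler_info()` instead of hostnames. After the persisted task finishes, `Client.who_has` should report that its result lives only on the chosen worker.

```python
import dask
from dask.distributed import Client, LocalCluster, futures_of, wait


def square(n):
    return n * n


def pinned_persist_demo():
    # Two in-process workers (threads only), used here purely for illustration.
    with LocalCluster(
        n_workers=2,
        threads_per_worker=1,
        processes=False,
        dashboard_address=None,
    ) as cluster:
        with Client(cluster) as client:
            # Pick one worker address to restrict the computation to.
            addresses = sorted(client.scheduler_info()["workers"])
            target = addresses[0]

            x = dask.delayed(square)(7)
            # allow_other_workers=False keeps the task on the listed worker.
            xx = client.persist(x, workers=[target], allow_other_workers=False)
            wait(xx)

            # Map each key to the workers holding its result.
            holders = client.who_has(futures_of(xx))
            locations = {addr for addrs in holders.values() for addr in addrs}
            return target, locations, xx.compute()


if __name__ == "__main__":
    print(pinned_persist_demo())
```

Setting `allow_other_workers=True` instead would turn the `workers` list into a preference that the scheduler may override.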

retries : int (defaults to 0)

Number of allowed automatic retries if computing a result fails
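
The effect of `retries` can be sketched with a task that fails once and then succeeds. Everything below is an illustrative assumption: the helper names `flaky` and `retry_demo` are hypothetical, and a marker file on disk is used to remember that a first attempt already happened, so the behaviour does not depend on how the function is serialized.

```python
import os
import tempfile

import dask
from dask.distributed import Client, wait


def flaky(marker_path):
    # Fail on the first attempt only; the marker file records that attempt.
    if not os.path.exists(marker_path):
        with open(marker_path, "w") as fh:
            fh.write("attempted")
        raise RuntimeError("transient failure (simulated)")
    return "ok"


def retry_demo():
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "attempted.flag")
        # In-process cluster with a single worker thread, for illustration only.
        with Client(
            processes=False,
            n_workers=1,
            threads_per_worker=1,
            dashboard_address=None,
        ) as client:
            task = dask.delayed(flaky)(marker)
            # One automatic retry is enough to get past the simulated failure.
            persisted = client.persist(task, retries=1)
            wait(persisted)
            return persisted.compute()


if __name__ == "__main__":
    print(retry_demo())
```

With the default `retries=0`, the same task would end in an error state and `persisted.compute()` would raise the `RuntimeError`.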

priority : Number

Optional priority of the tasks. Zero is the default. Higher priorities take precedence.

fifo_timeout : timedelta str (defaults to '60s')

Allowed amount of time between calls for them to be considered the same priority

resources : dict (defaults to {})

Defines the `resources` each instance of this mapped task requires on the worker, e.g. {'GPU': 2}. See the worker resources documentation for details on defining resources.

actors : bool or dict (default None)

Whether these tasks should exist on the worker as stateful actors. Specified on a global (True/False) or per-task ({'x': True, 'y': False}) basis. See actors for additional details.

**kwargs :

Options to pass to the graph optimize calls

Returns

List of collections, or single collection, depending on type of input.

Persist dask collections on cluster

See Also

Client.compute

Examples

>>> xx = client.persist(x)  # doctest: +SKIP
>>> xx, yy = client.persist([x, y])  # doctest: +SKIP
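
A self-contained variant of the calls above, written as a sketch under stated assumptions: an in-process `Client(processes=False)` stands in for a real cluster, `dask.delayed` objects stand in for `x` and `y`, and the names `inc` and `persist_demo` are hypothetical. It shows that a single collection yields a single collection, a list yields a list, and the persisted results are backed by futures.

```python
import dask
from dask.distributed import Client, futures_of, wait


def inc(n):
    return n + 1


def persist_demo():
    # In-process cluster (threads only), assumed here purely for illustration.
    with Client(
        processes=False,
        n_workers=1,
        threads_per_worker=2,
        dashboard_address=None,
    ) as client:
        x = dask.delayed(inc)(1)
        y = dask.delayed(inc)(x)  # y depends on x

        # Single collection in, single collection out.
        xx = client.persist(x)
        # Sequence in, list of collections out.
        xx2, yy = client.persist([x, y])

        # Block until the background computation has finished.
        wait([xx2, yy])

        # The persisted collection now holds futures instead of raw tasks.
        has_futures = len(futures_of(yy)) >= 1
        return xx.compute(), yy.compute(), has_futures


if __name__ == "__main__":
    print(persist_demo())
```

Calling `wait` is optional. `persist` returns immediately, and later computations on `xx` or `yy` reuse the results already held by the cluster.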

Back References

The following pages refer to this document, either explicitly or through code examples that use it.

distributed.client.Client.normalize_collection
distributed.client.Client.publish_dataset


File: /distributed/client.py#2907
type: <class 'function'>
Commit: