dask.array.reduction(x, chunk, aggregate, axis=None, keepdims=False, dtype=None, split_every=None, combine=None, name=None, out=None, concatenate=True, output_size=1, meta=None)

General version of reductions.
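For example (a minimal sketch; chunk_ss and agg_ss are illustrative names, not part of dask), a sum-of-squares reduction along one axis can be assembled from a per-chunk function and a final aggregate:

    import numpy as np
    import dask.array as da

    def chunk_ss(x_chunk, axis, keepdims):
        # Runs in parallel on every original chunk of x.
        return np.sum(x_chunk ** 2, axis=axis, keepdims=keepdims)

    def agg_ss(x_chunk, axis, keepdims):
        # Runs on the concatenated partial results to produce the output.
        return np.sum(x_chunk, axis=axis, keepdims=keepdims)

    x = da.random.random((1000, 1000), chunks=(100, 100))
    result = da.reduction(x, chunk_ss, agg_ss, axis=0, dtype=np.float64)
    assert result.compute().shape == (1000,)

The roles of each argument, and of the parameters passed to the chunk/combine/aggregate callables, are described below.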
Parameters

x : Array
    Data being reduced along one or more axes.
chunk : callable(x_chunk, axis, keepdims)
    First function to be executed when resolving the dask graph. This
    function is applied in parallel to all original chunks of x. See
    Function Parameters below.
combine : callable(x_chunk, axis, keepdims), optional
    Function used for intermediate recursive aggregation (see split_every
    below). If omitted, it defaults to aggregate. If the reduction can be
    performed in fewer than 3 steps, it will not be invoked at all.
aggregate : callable(x_chunk, axis, keepdims)
    Last function to be executed when resolving the dask graph, producing
    the final output. It is always invoked, even when the reduced Array has
    a single chunk along the reduced axes.
axis : int or sequence of ints, optional
    Axis or axes to aggregate upon. If omitted, aggregate along all axes.
keepdims : boolean, optional
    Whether the reduction function should preserve the reduced axes,
    leaving them at size output_size, or remove them.
dtype : np.dtype
    Data type of the output. This argument was previously optional, but
    leaving it as None now raises an exception.
split_every : int >= 2 or dict(axis: int), optional
    Determines the depth of the recursive aggregation. If set to a value
    equal to or greater than the number of input chunks, the aggregation is
    performed in two steps: one chunk function per input chunk and a single
    aggregate function at the end. If set to less than that, an
    intermediate combine function is used, so that no combine or aggregate
    function receives more than split_every inputs. The depth of the
    aggregation graph is
    $\log_{\text{split\_every}}(\text{input chunks along reduced axes})$.
    Setting this to a low value can reduce cache size and network
    transfers, at the cost of more CPU time and a larger dask graph (see
    the first sketch after this parameter list). Omit to let dask
    heuristically decide a good default. A default can also be set globally
    with the split_every key in dask.config.
name : str, optional
    Prefix of the keys of the intermediate and output nodes. If omitted, it
    defaults to the function names.
out : Array, optional
    Another dask array whose contents will be replaced. Omit to create a
    new one. Note that, unlike in numpy, this setting gives no performance
    benefits whatsoever, but can still be useful if one needs to preserve
    the references to a previously existing Array.
concatenate : bool, optional
    If True (the default), the outputs of the chunk/combine functions are
    concatenated into a single np.ndarray before being passed to the
    combine/aggregate functions. If False, the input of combine and
    aggregate will be either a list of the raw outputs of the previous step
    or a single output, and the function must concatenate it itself.
    Setting this to False can be useful when the chunk and/or combine steps
    do not produce np.ndarrays (see the second sketch after this parameter
    list).
output_size : int >= 1, optional
    Size of the output of the aggregate function along the reduced axes.
    Ignored if keepdims is False.
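To illustrate split_every, here is a sketch (array sizes chosen arbitrarily) comparing a wide two-step reduction with a deeper 4-ary tree, reusing np.sum as both chunk and aggregate:

    import numpy as np
    import dask
    import dask.array as da

    x = da.ones(4096, chunks=16)   # 256 input chunks along the reduced axis

    # split_every >= the number of input chunks: one chunk call per block,
    # then a single aggregate call (a wide, shallow graph).
    wide = da.reduction(x, np.sum, np.sum, axis=0, dtype=np.float64,
                        split_every=256)

    # split_every=4: intermediate combine steps (here defaulting to the
    # aggregate, np.sum) keep the fan-in at 4; the tree has depth
    # log_4(256) = 4, at the cost of a larger graph.
    deep = da.reduction(x, np.sum, np.sum, axis=0, dtype=np.float64,
                        split_every=4)

    assert wide.compute() == deep.compute() == 4096.0

    # The default can also be set globally through dask.config:
    with dask.config.set({"split_every": 4}):
        also_deep = da.ones(4096, chunks=16).sum()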
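And to illustrate concatenate=False, here is a sketch of a mean computed from dicts of partials (helper names are invented; the pattern is borrowed from dask.array.mean's own implementation, including the computing_meta branch that returns dask's empty metadata probe unchanged):

    import numpy as np
    import dask.array as da

    def _leaves(parts):
        # With concatenate=False, the previous step's outputs arrive either
        # as one raw output or as a (possibly nested) list of them.
        if isinstance(parts, list):
            for p in parts:
                yield from _leaves(p)
        else:
            yield parts

    def mean_chunk(x_chunk, axis=None, keepdims=False, computing_meta=False):
        if computing_meta:          # metadata probe: pass the array through
            return x_chunk
        return {"total": x_chunk.sum(), "count": x_chunk.size}

    def mean_combine(parts, axis=None, keepdims=False, computing_meta=False):
        if computing_meta:
            return parts
        flat = list(_leaves(parts))
        return {"total": sum(p["total"] for p in flat),
                "count": sum(p["count"] for p in flat)}

    def mean_agg(parts, axis=None, keepdims=False, computing_meta=False):
        if computing_meta:
            return parts
        merged = mean_combine(parts)
        return merged["total"] / merged["count"]

    x = da.random.random((1000, 1000), chunks=(100, 100))
    m = da.reduction(x, mean_chunk, mean_agg, combine=mean_combine,
                     dtype=np.float64, concatenate=False)
    print(m.compute())              # approximately 0.5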
Function Parameters

x_chunk : numpy.ndarray
    Individual input chunk. For chunk functions, it is one of the original
    chunks of x. For combine and aggregate functions, it is the
    concatenation of the outputs produced by the previous chunk or combine
    functions. If concatenate=False, it is a list of the raw outputs from
    the previous functions.
axis : tuple
    Normalized list of axes to reduce upon, e.g. (0,). Scalar, negative,
    and None axes have been normalized away. Note that some numpy reduction
    functions cannot reduce along multiple axes at once and strictly
    require an int as input. Such functions have to be wrapped to cope (see
    the sketch after this list).
keepdims : bool
    Whether the reduction function should preserve the reduced axes or
    remove them.
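A hypothetical wrapper like the one below (the name allow_tuple_axis is invented for illustration) adapts a reducer that only accepts a single integer axis to the (x_chunk, axis, keepdims) calling convention used here:

    import numpy as np

    def allow_tuple_axis(func):
        # Only valid for reducers where reducing one axis at a time matches
        # reducing all axes jointly (e.g. sums and maxima, but not medians).
        def wrapped(x_chunk, axis, keepdims):
            for ax in sorted(axis, reverse=True):   # highest axis first, so
                x_chunk = func(x_chunk, axis=ax)    # remaining indices stay valid
            if keepdims:
                x_chunk = np.expand_dims(x_chunk, tuple(sorted(axis)))
            return x_chunk
        return wrapped

    # e.g. pass allow_tuple_axis(my_int_axis_reducer) as the chunk and
    # aggregate arguments of dask.array.reduction.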