apply_gufunc(func, signature, *args, axes=None, axis=None, keepdims=False, output_dtypes=None, output_sizes=None, vectorize=None, allow_rechunk=False, meta=None, **kwargs)
The signature argument determines whether the function consumes or produces core dimensions. The remaining dimensions in the given input arrays (*args) are considered loop dimensions and are required to broadcast naturally against each other.
In other words, this function is like np.vectorize, but for the blocks of dask arrays. If the function itself should also be vectorized, use vectorize=True for convenience.
func: Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions (if this is not the case, set vectorize=True). If this function returns multiple outputs, the signature has to declare the core dimensions of each output, as in "(i)->(),()".
signature: Specifies what core dimensions are consumed and produced by func, according to the specification of the NumPy generalized ufunc signature.
*args: Input arrays or scalars to the callable function.
axes: A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)", appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the last two axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
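As an illustration of the axes keyword, the sketch below multiplies stacks of matrices where the first argument stores its (i, j) matrices in axes 0 and 1 rather than in the last two axes. The array shapes, chunk sizes, and the helper name matmul are assumptions chosen for the example, not part of the documented API.

```python
import numpy as np
import dask.array as da


def matmul(x, y):
    # Hypothetical helper: expects the core dimensions in the last two axes.
    return np.matmul(x, y)


# `a_np` keeps its (i, j) matrices in axes 0 and 1; the loop dimension is last.
a_np = np.arange(3 * 5 * 4, dtype=float).reshape(3, 5, 4)
# `b_np` uses the default layout: loop dimension first, matrices in the last two axes.
b_np = np.arange(4 * 5 * 2, dtype=float).reshape(4, 5, 2)

# Core dimensions stay in a single chunk; only the loop dimension is split.
a = da.from_array(a_np, chunks=(3, 5, 2))
b = da.from_array(b_np, chunks=(2, 5, 2))

c = da.apply_gufunc(
    matmul,
    "(i,j),(j,k)->(i,k)",
    a,
    b,
    axes=[(0, 1), (-2, -1), (-2, -1)],
    output_dtypes=float,
)

# NumPy reference: move the loop axis of `a_np` to the front, then multiply.
expected = np.matmul(np.moveaxis(a_np, -1, 0), b_np)
result = c.compute()
```

Here the loop dimension of length 4 comes first in the output, followed by the (i, k) core dimensions.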
axis: A single axis over which a generalized ufunc should operate. This is a shortcut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
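The equivalence between axis and axes can be checked with a small sketch. The inner function and the input arrays below are assumptions made up for the example.

```python
import numpy as np
import dask.array as da


def inner(x, y):
    # Hypothetical helper: inner product over the last axis,
    # which is where the core dimension "i" is placed before the call.
    return np.sum(x * y, axis=-1)


x_np = np.arange(15.0).reshape(3, 5)
# The core dimension (axis 0, length 3) is kept in one chunk.
x = da.from_array(x_np, chunks=(3, 2))
y = da.ones((3, 5), chunks=(3, 2))

via_axis = da.apply_gufunc(
    inner, "(i),(i)->()", x, y, axis=0, output_dtypes=float
)
via_axes = da.apply_gufunc(
    inner, "(i),(i)->()", x, y, axes=[(0,), (0,), ()], output_dtypes=float
)

# NumPy reference: multiplying by ones and summing over axis 0.
expected = x_np.sum(axis=0)
```

Both calls reduce over axis 0 and leave the remaining axis of length 5 as the loop dimension.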
keepdims: If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
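A minimal sketch of the effect of keepdims, assuming a simple sum reduction (the total function and the input data are invented for the example):

```python
import numpy as np
import dask.array as da


def total(x):
    # Hypothetical helper: reduces the core dimension "i" on the last axis.
    return np.sum(x, axis=-1)


a_np = np.arange(24.0).reshape(4, 6)
# Axis 0 is the core dimension here, so it is kept in one chunk.
a = da.from_array(a_np, chunks=(4, 3))

reduced = da.apply_gufunc(total, "(i)->()", a, axis=0, output_dtypes=float)
kept = da.apply_gufunc(
    total, "(i)->()", a, axis=0, keepdims=True, output_dtypes=float
)

# NumPy reference with the reduced axis retained as length one.
expected = a_np.sum(axis=0, keepdims=True)
```

With keepdims=True the reduced axis stays in place with length one, so the result broadcasts against the input.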
output_dtypes: Valid numpy dtype specification or list thereof. If not given, func is called with a small set of data to try to determine the output dtypes automatically.
output_sizes: Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
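The following sketch shows a case where output_sizes is needed: the function introduces a core dimension "k" that does not appear on any input. The min_max function and the data are assumptions for illustration.

```python
import numpy as np
import dask.array as da


def min_max(x):
    # Hypothetical helper: collapses core dimension "i" into a new
    # core dimension "k" of length 2 holding (minimum, maximum).
    return np.stack([x.min(axis=-1), x.max(axis=-1)], axis=-1)


a_np = np.arange(24.0).reshape(4, 6)
a = da.from_array(a_np, chunks=(2, 6))

r = da.apply_gufunc(
    min_max,
    "(i)->(k)",
    a,
    output_sizes={"k": 2},  # size of the new core dimension
    output_dtypes=float,
)

# NumPy reference computed on the full array.
expected = min_max(a_np)
```

Because "k" cannot be inferred from the inputs, its length is supplied through output_sizes.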
vectorize: If set to True, np.vectorize is applied to func for convenience. Defaults to False.
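As a sketch of when vectorize=True helps, consider a function written for a single 1-D vector only. The span function below is a made-up example and deliberately rejects anything that is not one-dimensional.

```python
import numpy as np
import dask.array as da


def span(v):
    # Hypothetical helper that only handles one vector at a time.
    if v.ndim != 1:
        raise ValueError("span expects a single 1-D vector")
    return v.max() - v.min()


a_np = np.arange(12.0).reshape(3, 4)
a = da.from_array(a_np, chunks=(1, 4))

# vectorize=True loops `span` over the loop dimension of each block.
r = da.apply_gufunc(span, "(i)->()", a, vectorize=True, output_dtypes=float)

# NumPy reference: per-row range.
expected = a_np.max(axis=-1) - a_np.min(axis=-1)
```

The looping happens in Python, so this is a convenience rather than a performance feature.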
allow_rechunk: Allows rechunking; otherwise chunk sizes need to match and each core dimension must consist of only one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
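A short sketch of the chunking requirement, assuming an input whose core dimension is split across two chunks (the total function and the shapes are invented for the example):

```python
import numpy as np
import dask.array as da


def total(x):
    # Hypothetical helper: reduces the core dimension "i" on the last axis.
    return np.sum(x, axis=-1)


# The core dimension (last axis, length 6) is split into two chunks of 3.
a = da.ones((4, 6), chunks=(2, 3))

try:
    da.apply_gufunc(total, "(i)->()", a, output_dtypes=float)
    raised = False
except ValueError:
    # Without allow_rechunk, a multi-chunk core dimension is rejected.
    raised = True

# With allow_rechunk=True the core dimension is rechunked into one chunk.
r = da.apply_gufunc(
    total, "(i)->()", a, output_dtypes=float, allow_rechunk=True
)
```

An alternative that keeps memory use explicit is to rechunk the input yourself before the call, so that each core dimension already sits in one chunk.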
meta: Tuple of empty ndarrays describing the shape and dtype of the output of the gufunc. Defaults to None.
**kwargs: Extra keyword arguments to pass to func.
Apply a generalized ufunc or similar Python function to arrays.
>>> import dask.array as da
>>> import numpy as np
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30))
>>> mean, std = da.apply_gufunc(stats, "(i)->(),()", a)
>>> mean.compute().shape
(10, 20)
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> a = da.random.normal(size=(   20,30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40))
>>> c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, vectorize=True)
>>> c.compute().shape
(10, 20, 30, 40)