dask 2021.10.0

blockwise(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)

A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.

Parameters

func : callable: Function to apply to individual tuples of blocks
out_ind : iterable: Block pattern of the output, something like 'ijk' or (1, 2, 3)
*args : sequence of Array, index pairs: Sequence like (x, 'ij', y, 'jk', z, 'i')
**kwargs : dict: Extra keyword arguments to pass to function
dtype : np.dtype: Datatype of resulting array.
concatenate : bool, keyword only: If True, concatenate arrays along dummy indices; otherwise pass lists of blocks
adjust_chunks : dict: Dictionary mapping index to function to be applied to chunk sizes
new_axes : dict, keyword only: New indexes and their dimension lengths
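
As a minimal sketch of how extra keyword arguments reach the applied function, the example below passes two keywords that blockwise does not consume itself. The function scale_and_shift and its parameters factor and offset are hypothetical names chosen for illustration; they are not part of dask.

```python
import numpy as np
import dask.array as da


def scale_and_shift(block, factor=1, offset=0):
    # Hypothetical in-memory function: it receives one NumPy block per call.
    return block * factor + offset


x = da.from_array(np.arange(6).reshape(2, 3), chunks=(1, 3))

# factor and offset are not blockwise parameters, so they are forwarded
# to scale_and_shift through **kwargs for every block.
z = da.blockwise(scale_and_shift, 'ij', x, 'ij',
                 factor=10, offset=1, dtype=x.dtype)
result = z.compute()
```

The output keeps the chunking of x because the function does not change block shapes.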

Tensor operation: Generalized inner and outer products

Examples

2D embarrassingly parallel operation on two arrays, x and y.

This example is valid syntax, but we were not able to check execution

>>> import operator, numpy as np, dask.array as da
>>> from dask.array import blockwise
>>> x = da.from_array([[1, 2],
...                    [3, 4]], chunks=(1, 2))
>>> y = da.from_array([[10, 20],
...                    [0, 0]])
>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8')
>>> z.compute()
array([[11, 22],
       [ 3,  4]])
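
The description states that elementwise operations are special cases of blockwise. A small sketch of that claim, assuming both inputs share the same chunking, compares the blockwise call with the ordinary + operator on the same arrays.

```python
import operator

import numpy as np
import dask.array as da

x = da.from_array(np.array([[1, 2], [3, 4]]), chunks=(1, 2))
y = da.from_array(np.array([[10, 20], [0, 0]]), chunks=(1, 2))

# Same index pattern on inputs and output: blocks are paired one-to-one.
via_blockwise = da.blockwise(operator.add, 'ij', x, 'ij', y, 'ij',
                             dtype=x.dtype)
via_operator = x + y

# Compare the two lazily built results after computing them.
same = bool((via_blockwise == via_operator).all().compute())
```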

Outer product multiplying a by b, two 1-d vectors

>>> a = da.from_array([0, 1, 2], chunks=1)
>>> b = da.from_array([10, 50, 100], chunks=1)
>>> z = blockwise(np.outer, 'ij', a, 'i', b, 'j', dtype='f8')
>>> z.compute()
array([[  0,   0,   0],
       [ 10,  50, 100],
       [ 20, 100, 200]])

z = x.T

>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype)
>>> z.compute()
array([[1, 3],
       [2, 4]])

The transpose case above is illustrative because it transposes at two levels: each in-memory block, by calling np.transpose, and the order of the blocks themselves, by switching the index order from ij to ji.
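
The two levels of transposition can be made visible by inspecting the block grid before and after. This is a sketch on a hypothetical 3 x 4 array split into 1 x 2 blocks, not an example from the dask documentation.

```python
import numpy as np
import dask.array as da

x_np = np.arange(12).reshape(3, 4)
x = da.from_array(x_np, chunks=(1, 2))

# np.transpose flips each in-memory block; 'ij' -> 'ji' reorders the blocks.
z = da.blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype)

block_grid_before = x.numblocks  # 3 blocks along i, 2 along j
block_grid_after = z.numblocks   # 2 blocks along j, 3 along i
matches_numpy = np.array_equal(z.compute(), x_np.T)
```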

We can compose these same patterns with more variables and more complex in-memory functions.

z = x + y.T

>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8')
>>> z.compute()
array([[11,  2],
       [23,  4]])

Any index, such as i, that is missing from the output index is interpreted as a contraction. Note that this differs from the Einstein convention: repeated indices do not imply contraction. In the case of a contraction, the passed function should expect an iterable of blocks for any array that holds that index. To receive arrays concatenated along the contracted dimensions instead, pass concatenate=True.

Inner product multiplying a by b, two 1-d vectors

>>> def sequence_dot(a_blocks, b_blocks):
...     result = 0
...     for a, b in zip(a_blocks, b_blocks):
...         result += a.dot(b)
...     return result

>>> z = blockwise(sequence_dot, '', a, 'i', b, 'i', dtype='f8')
>>> z.compute()
250
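
The contraction paragraph also mentions concatenate=True. As a hedged sketch of that option, the matrix-vector product below contracts over j and lets dask concatenate the blocks along j, so that np.dot receives ordinary arrays rather than lists of blocks. The arrays m and v are made up for illustration.

```python
import numpy as np
import dask.array as da

m_np = np.arange(16).reshape(4, 4)
v_np = np.array([1, 0, 2, 1])
m = da.from_array(m_np, chunks=(2, 2))
v = da.from_array(v_np, chunks=2)

# 'j' is absent from the output index 'i', so it is contracted.
# With concatenate=True each call sees a (2, 4) slab of m and all of v.
z = da.blockwise(np.dot, 'i', m, 'ij', v, 'j',
                 concatenate=True, dtype=m.dtype)
result = z.compute()
```

Concatenation keeps the applied function simple, at the cost of holding every block along the contracted index in memory at once.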

Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.

>>> def f(a):
...     return a[:, None] * np.ones((1, 5))

>>> z = blockwise(f, 'az', a, 'a', new_axes={'z': 5}, dtype=a.dtype)
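
The new_axes example above does not display a result. A self-contained sketch of the same call, restating a and f so that it runs on its own, shows the resulting shape and chunking.

```python
import numpy as np
import dask.array as da


def f(block):
    # Broadcast a 1-d block of length n against a row of ones: (n, 5).
    return block[:, None] * np.ones((1, 5))


a = da.from_array(np.array([0, 1, 2]), chunks=1)

# 'z' does not appear in any input index, so its length comes from new_axes.
z = da.blockwise(f, 'az', a, 'a', new_axes={'z': 5}, dtype='f8')

shape, chunks = z.shape, z.chunks
values = z.compute()
```

The new dimension z sits in a single chunk of length 5, while dimension a keeps the three chunks of the input.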

New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).

>>> z = blockwise(f, 'az', a, 'a', new_axes={'z': (5, 5)}, dtype=a.dtype)
>>> z.chunks
((1, 1, 1), (5, 5))

If the applied function changes the size of each chunk, you can specify this with an adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.

>>> def double(x):
...     return np.concatenate([x, x])

>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)
>>> y.chunks
((2, 2), (2,))
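
The chunk sizes declared through adjust_chunks should agree with what the function actually returns. The sketch below restates x and double so that it runs on its own, then compares the declared chunks with the computed array.

```python
import numpy as np
import dask.array as da


def double(block):
    # Stack the block on top of itself, doubling its length along axis 0.
    return np.concatenate([block, block])


x = da.from_array(np.array([[1, 2], [3, 4]]), chunks=(1, 2))

# Each chunk along 'i' doubles in size, so tell dask about it.
y = da.blockwise(double, 'ij', x, 'ij',
                 adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)

declared = y.chunks
computed = y.compute()
```

Each row of x appears twice in the result because duplication happens per block, not across the whole array.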

Include literals by indexing with None

>>> z = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype)
>>> z.compute()
array([[1235, 1236],
       [1237, 1238]])

Back References

The following pages refer to this document either explicitly or contain code examples using it.

dask.array.core.map_blocks dask.blockwise.make_blockwise_graph dask.array.blockwise.blockwise

File: /dask/array/blockwise.py#12
type: <class 'function'>
Commit: