dask 2021.10.0

corrcoef(x, y=None, rowvar=1)

This docstring was copied from numpy.corrcoef.

Some inconsistencies with the Dask version may exist.

Please refer to the documentation for cov for more detail. The relationship between the correlation coefficient matrix, R, and the covariance matrix, C, is

$$R_{ij} = \frac{ C_{ij} } { \sqrt{ C_{ii} * C_{jj} } }$$

The values of R are between -1 and 1, inclusive.
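The relationship above can be checked numerically. The following is a minimal sketch using NumPy only; the array shape, the seed, and the names `x`, `C`, `d`, and `R_manual` are illustrative assumptions, not part of the original docstring.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.random((3, 5))  # 3 variables (rows), 5 observations each (columns)

C = np.cov(x)                  # covariance matrix of the 3 variables
d = np.sqrt(np.diag(C))        # sqrt(C_ii) for each variable
R_manual = C / np.outer(d, d)  # R_ij = C_ij / sqrt(C_ii * C_jj)

R = np.corrcoef(x)
print(np.allclose(R, R_manual))
```

The comparison uses `np.allclose` rather than exact equality because the two computations may differ in the last few bits.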

Notes

Due to floating point rounding, the resulting array may not be Hermitian, the diagonal elements may not be 1, and the elements may not satisfy the inequality abs(a) <= 1. The real and imaginary parts are clipped to the interval [-1, 1] in an attempt to improve on that situation, but this is not much help in the complex case.
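In practice this means properties of R are better checked with a tolerance than with exact equality. A minimal sketch for real-valued input, where the data, the seed, and the variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
R = np.corrcoef(rng.random((4, 50)))  # 4 variables, 50 observations

# Compare with a tolerance instead of ==, since rounding can break exact identities.
is_symmetric = np.allclose(R, R.T)
has_unit_diagonal = np.allclose(np.diag(R), 1.0)
is_in_range = bool(np.all(np.abs(R) <= 1.0 + 1e-12))

print(is_symmetric, has_unit_diagonal, is_in_range)
```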

This function accepts but discards arguments bias and ddof. This is for backwards compatibility with previous versions of this function. These arguments had no effect on the return values of the function and can be safely ignored in this and previous versions of NumPy.

Parameters

x : array_like

A 1-D or 2-D array containing multiple variables and observations. Each row of x represents a variable, and each column a single observation of all those variables. Also see rowvar below.

y : array_like, optional

An additional set of variables and observations. y has the same shape as x.

rowvar : bool, optional

If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.

bias : _NoValue, optional (Not supported in Dask)

Deprecated. Has no effect, do not use.

ddof : _NoValue, optional (Not supported in Dask)

Deprecated. Has no effect, do not use.

dtype : data-type, optional (Not supported in Dask)

Data-type of the result. By default, the return data-type will have at least numpy.float64 precision.

versionadded

Returns

R : ndarray

The correlation coefficient matrix of the variables.

Return Pearson product-moment correlation coefficients.

See Also

cov

Covariance matrix

Examples

In this example we generate two random arrays, xarr and yarr, and compute the row-wise and column-wise Pearson correlation coefficients, R. Since rowvar is true by default, we first find the row-wise Pearson correlation coefficients between the variables of xarr.

This example is valid syntax, but we were not able to check execution
>>> import numpy as np  # doctest: +SKIP
>>> rng = np.random.default_rng(seed=42)  # doctest: +SKIP
>>> xarr = rng.random((3, 3))  # doctest: +SKIP
>>> xarr  # doctest: +SKIP
array([[0.77395605, 0.43887844, 0.85859792],
       [0.69736803, 0.09417735, 0.97562235],
       [0.7611397 , 0.78606431, 0.12811363]])
>>> R1 = np.corrcoef(xarr)  # doctest: +SKIP
>>> R1  # doctest: +SKIP
array([[ 1.        ,  0.99256089, -0.68080986],
       [ 0.99256089,  1.        , -0.76492172],
       [-0.68080986, -0.76492172,  1.        ]])

If we add another set of variables and observations, yarr, we can compute the row-wise Pearson correlation coefficients between the variables in xarr and yarr.

>>> yarr = rng.random((3, 3))  # doctest: +SKIP
>>> yarr  # doctest: +SKIP
array([[0.45038594, 0.37079802, 0.92676499],
       [0.64386512, 0.82276161, 0.4434142 ],
       [0.22723872, 0.55458479, 0.06381726]])
>>> R2 = np.corrcoef(xarr, yarr)  # doctest: +SKIP
>>> R2  # doctest: +SKIP
array([[ 1.        ,  0.99256089, -0.68080986,  0.75008178, -0.934284  ,
        -0.99004057],
       [ 0.99256089,  1.        , -0.76492172,  0.82502011, -0.97074098,
        -0.99981569],
       [-0.68080986, -0.76492172,  1.        , -0.99507202,  0.89721355,
         0.77714685],
       [ 0.75008178,  0.82502011, -0.99507202,  1.        , -0.93657855,
        -0.83571711],
       [-0.934284  , -0.97074098,  0.89721355, -0.93657855,  1.        ,
         0.97517215],
       [-0.99004057, -0.99981569,  0.77714685, -0.83571711,  0.97517215,
         1.        ]])

Finally, if we use the option rowvar=False, the columns are treated as the variables and we find the column-wise Pearson correlation coefficients between the variables in xarr and yarr.

>>> R3 = np.corrcoef(xarr, yarr, rowvar=False)  # doctest: +SKIP
>>> R3  # doctest: +SKIP
array([[ 1.        ,  0.77598074, -0.47458546, -0.75078643, -0.9665554 ,
         0.22423734],
       [ 0.77598074,  1.        , -0.92346708, -0.99923895, -0.58826587,
        -0.44069024],
       [-0.47458546, -0.92346708,  1.        ,  0.93773029,  0.23297648,
         0.75137473],
       [-0.75078643, -0.99923895,  0.93773029,  1.        ,  0.55627469,
         0.47536961],
       [-0.9665554 , -0.58826587,  0.23297648,  0.55627469,  1.        ,
        -0.46666491],
       [ 0.22423734, -0.44069024,  0.75137473,  0.47536961, -0.46666491,
         1.        ]])
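
The examples above come from the NumPy docstring and call np.corrcoef. For the Dask version documented here, a minimal sketch might look like the following. It assumes the function is reachable as `da.corrcoef`; the array shape, the chunk sizes, and the variable names are arbitrary choices made for illustration.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(seed=42)
x_np = rng.random((3, 1000))              # 3 variables, 1000 observations
x = da.from_array(x_np, chunks=(3, 250))  # split the observations into 4 chunks

R_lazy = da.corrcoef(x)  # builds a lazy task graph; nothing is computed yet
R = R_lazy.compute()     # evaluates the graph and returns a NumPy array

# The result should agree with NumPy up to floating point rounding.
print(np.allclose(R, np.corrcoef(x_np)))
```

Only x, y, and rowvar are passed here, since bias, ddof, and dtype are marked above as not supported in Dask.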

Back References

The following pages refer to this document, either explicitly or through code examples that use it.

dask.array.routines.cov


File: /dask/array/routines.py#1518
type: <class 'function'>
Commit: