dask 2021.10.0

hypergeometric(self, ngood, nbad, nsample, size=None, chunks='auto', **kwargs)

This docstring was copied from numpy.random.mtrand.RandomState.hypergeometric.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a hypergeometric distribution with specified parameters: `ngood` (ways to make a good selection), `nbad` (ways to make a bad selection), and `nsample` (number of items sampled, which is less than or equal to the sum ngood + nbad).

Note

New code should use the hypergeometric method of a default_rng() instance instead; please see the random quick start (`random-quick-start`).

Notes

The probability mass function of the hypergeometric distribution is

$$P(x) = \frac{\binom{g}{x}\binom{b}{n-x}}{\binom{g+b}{n}},$$

where $0 \le x \le n$ and $n-b \le x \le g$. Here P(x) is the probability of x good results in the drawn sample, g = `ngood`, b = `nbad`, and n = `nsample`.
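As a minimal sketch of the formula above, the probability can be evaluated directly with binomial coefficients from the standard library. The helper name `hypergeom_pmf` is hypothetical and is not part of Dask or NumPy; the parameter values are taken from the first example on this page.

```python
from math import comb


def hypergeom_pmf(x, g, b, n):
    """P(x): probability of x good items in n draws without replacement.

    Hypothetical helper for illustration; g = ngood, b = nbad, n = nsample.
    """
    # Outside the support max(0, n - b) <= x <= min(n, g) the probability is zero.
    if x < max(0, n - b) or x > min(n, g):
        return 0.0
    return comb(g, x) * comb(b, n - x) / comb(g + b, n)


g, b, n = 100, 2, 10
pmf = [hypergeom_pmf(x, g, b, n) for x in range(n + 1)]

total = sum(pmf)                               # probabilities sum to 1
mean = sum(x * p for x, p in enumerate(pmf))   # equals n * g / (g + b)
print(total, mean)
```

With only two bad items in the population, every sample of ten contains at least eight good items, so the probabilities for x below 8 are exactly zero.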

Consider an urn with black and white marbles in it: `ngood` of them are black and `nbad` are white. If you draw `nsample` marbles without replacement, then the hypergeometric distribution describes the number of black marbles in the drawn sample.
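The urn picture can be simulated literally with the standard library, as a rough sketch. The function name `draw_black_count` and the marble counts are illustrative assumptions; this is not how NumPy or Dask generate the samples internally.

```python
import random


def draw_black_count(ngood, nbad, nsample, rng):
    """Draw nsample marbles without replacement and count the black ones."""
    urn = ["black"] * ngood + ["white"] * nbad
    drawn = rng.sample(urn, nsample)  # sampling without replacement
    return drawn.count("black")


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [draw_black_count(15, 15, 15, rng) for _ in range(2000)]

# The average should be close to nsample * ngood / (ngood + nbad) = 7.5.
average = sum(counts) / len(counts)
print(average)
```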

Note that this distribution is very similar to the binomial distribution, except that here samples are drawn without replacement, whereas in the binomial case samples are drawn with replacement (or the sample space is infinite). As the population ngood + nbad becomes large relative to nsample, this distribution approaches the binomial.
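The convergence to the binomial can be checked numerically. The sketch below, using hypothetical helper names and an arbitrarily chosen 60/40 population split, grows the population while keeping the fraction of good items fixed and reports the largest pointwise difference between the two probability mass functions.

```python
from math import comb


def hypergeom_pmf(x, g, b, n):
    # Probability of x good items in n draws without replacement.
    return comb(g, x) * comb(b, n - x) / comb(g + b, n)


def binom_pmf(x, n, p):
    # Probability of x successes in n independent draws with replacement.
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_gap(scale, n=10):
    """Largest |hypergeometric - binomial| over x for a 60/40 population."""
    g, b = 6 * scale, 4 * scale
    p = g / (g + b)
    return max(
        abs(hypergeom_pmf(x, g, b, n) - binom_pmf(x, n, p))
        for x in range(n + 1)
    )


# Population sizes 50, 500, 5000, 50000 with nsample fixed at 10.
gaps = [max_gap(scale) for scale in (5, 50, 500, 5000)]
print(gaps)
```

The gap shrinks as the population grows, which is the sense in which the two distributions become interchangeable for large populations.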

Parameters

ngood : int or array_like of ints

Number of ways to make a good selection. Must be nonnegative.

nbad : int or array_like of ints

Number of ways to make a bad selection. Must be nonnegative.

nsample : int or array_like of ints

Number of items sampled. Must be at least 1 and at most ngood + nbad.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if `ngood`, `nbad`, and `nsample` are all scalars. Otherwise, np.broadcast(ngood, nbad, nsample).size samples are drawn.
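The shape rules for size can be illustrated with NumPy's `Generator.hypergeometric`, which the note above recommends for new code. This is a sketch against NumPy only; the Dask method documented here also takes a chunks argument, and its behavior may differ from what is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# All parameters scalar and size=None: a single value is returned.
scalar_draw = rng.hypergeometric(100, 2, 10)

# Explicit size: 2 * 3 * 4 samples arranged in that shape.
shaped = rng.hypergeometric(100, 2, 10, size=(2, 3, 4))

# Array-valued ngood with size=None: one sample per broadcast element.
ngood = np.array([10, 20, 30])
broadcast_draw = rng.hypergeometric(ngood, 5, 4)
expected = np.broadcast(ngood, 5, 4).size

print(np.ndim(scalar_draw), shaped.shape, broadcast_draw.shape, expected)
```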

Returns

out : ndarray or scalar

Drawn samples from the parameterized hypergeometric distribution. Each sample is the number of good items within a randomly selected subset of size `nsample` taken from a set of `ngood` good items and `nbad` bad items.

Draw samples from a Hypergeometric distribution.

See Also

Generator.hypergeometric

which should be used for new code.

scipy.stats.hypergeom

probability mass function, cumulative distribution function, etc.

Examples

Draw samples from the distribution:

>>> import numpy as np  # doctest: +SKIP
>>> ngood, nbad, nsamp = 100, 2, 10  # doctest: +SKIP
# number of good, number of bad, and number of samples
>>> s = np.random.hypergeometric(ngood, nbad, nsamp, 1000)  # doctest: +SKIP
>>> from matplotlib.pyplot import hist  # doctest: +SKIP
>>> hist(s)  # doctest: +SKIP
#   note that it is very unlikely to grab both bad items

Suppose you have an urn with 15 white and 15 black marbles. If you pull 15 marbles at random, how likely is it that 12 or more of them are one color?

>>> s = np.random.hypergeometric(15, 15, 15, 100000)  # doctest: +SKIP
>>> sum(s>=12)/100000. + sum(s<=3)/100000.  # doctest: +SKIP
#   answer = 0.003 ... pretty unlikely!
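The simulated answer can be cross-checked against the exact value by summing the probability formula from the Notes over both tails. This is a sketch with a hypothetical helper, `hypergeom_pmf`, which is not part of Dask or NumPy.

```python
from math import comb


def hypergeom_pmf(x, g, b, n):
    # Probability of x good items in n draws without replacement.
    return comb(g, x) * comb(b, n - x) / comb(g + b, n)


g = b = n = 15

# "12 or more of one color" means 12..15 of the first color or 0..3 of it.
upper_tail = sum(hypergeom_pmf(x, g, b, n) for x in range(12, 16))
lower_tail = sum(hypergeom_pmf(x, g, b, n) for x in range(0, 4))
p_tail = upper_tail + lower_tail

print(round(p_tail, 4))  # 0.0028
```

The two tails are equal by symmetry, and the exact value agrees with the simulated estimate of about 0.003.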

File: /dask/array/random.py#301
type: <class 'function'>
Commit: