Document

scipy 1.8.0 Pypi GitHub Homepage

Attributes

data : ndarray, shape (n,m): The n data points of dimension m to be indexed. This array is not copied unless this is necessary to produce a contiguous array of doubles. The data are also copied if the kd-tree is built with :None:None:`copy_data=True`.
leafsize : positive int: The number of points at which the algorithm switches over to brute-force.
m : int: The dimension of a single data-point.
n : int: The number of data points.
maxes : ndarray, shape (m,): The maximum value in each dimension of the n data points.
mins : ndarray, shape (m,): The minimum value in each dimension of the n data points.
size : int: The number of nodes in the tree.

This class provides an index into a set of k-dimensional points which can be used to rapidly look up the nearest neighbors of any point.

Notes

The algorithm used is described in Maneewongvatana and Mount 1999. The general idea is that the kd-tree is a binary tree, each of whose nodes represents an axis-aligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value.

During construction, the axis and splitting point are chosen by the "sliding midpoint" rule, which ensures that the cells do not all become long and thin.

The tree can be queried for the r closest neighbors of any given point (optionally returning only those within some maximum distance of the point). It can also be queried, with a substantial gain in efficiency, for the r approximate closest neighbors.

For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. High-dimensional nearest-neighbor queries are a substantial open problem in computer science.

Parameters

data : array_like, shape (n,m): The n data points of dimension m to be indexed. This array is not copied unless this is necessary to produce a contiguous array of doubles, and so modifying this data will result in bogus results. The data are also copied if the kd-tree is built with copy_data=True.
leafsize : positive int, optional: The number of points at which the algorithm switches over to brute-force. Default: 10.
compact_nodes : bool, optional: If True, the kd-tree is built to shrink the hyperrectangles to the actual data range. This usually gives a more compact tree that is robust against degenerated input data and gives faster queries at the expense of longer build time. Default: True.
copy_data : bool, optional: If True the data is always copied to protect the kd-tree against data corruption. Default: False.
balanced_tree : bool, optional: If True, the median is used to split the hyperrectangles instead of the midpoint. This usually gives a more compact tree and faster queries at the expense of longer build time. Default: True.
boxsize : array_like or scalar, optional: Apply a m-d toroidal topology to the KDTree.. The topology is generated by $x_i + n_i L_i$ where $n_i$ are integers and $L_i$ is the boxsize along i-th dimension. The input data shall be wrapped into $[0, L_i)$ . A ValueError is raised if any of the data is outside of this bound.

kd-tree for quick nearest-neighbor lookup.

Examples

See :

Back References

The following pages refer to to this document either explicitly or contain code examples using this.

scipy.spatial._kdtree.KDTree.query_ball_point scipy.spatial._kdtree.KDTree.query_ball_tree scipy.spatial._kdtree.KDTree.count_neighbors scipy.spatial._kdtree.KDTree.sparse_distance_matrix scipy.spatial._kdtree.KDTree.query scipy.spatial._kdtree.KDTree.query_pairs

Local connectivity graph

Hover to see nodes names; edges to Self not shown, Caped at 50 nodes.

Using a canvas is more power efficient and can get hundred of nodes ; but does not allow hyperlinks; , arrows or text (beyond on hover)

SVG is more flexible but power hungry; and does not scale well to 50 + nodes.

All aboves nodes referred to, (or are referred from) current nodes; Edges from Self to other have been omitted (or all nodes would be connected to the central node "self" which is not useful). Nodes are colored by the library they belong to, and scaled with the number of references pointing them

GitHub : /scipy/spatial/_kdtree.py#189
type: <class 'type'>
Commit: