pandas 1.4.2

Attributes

categories : Index: The categories of this categorical.
codes : ndarray: The codes (integer positions that point to the categories) of this categorical; read-only.
ordered : bool: Whether or not this Categorical is ordered.
dtype : CategoricalDtype: The instance of CategoricalDtype storing the categories and ordered.

Categoricals can take on only a limited, and usually fixed, number of possible values (`categories`). In contrast to statistical categorical variables, a Categorical might have an order, but numerical operations (additions, divisions, ...) are not possible.

All values of the Categorical are either in `categories` or `np.nan`. Assigning values outside of `categories` will raise a `ValueError`. Order is defined by the order of the `categories`, not by the lexical order of the values.
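
The rules above can be sketched as follows. The sample values and category names are illustrative assumptions. pandas 1.4 raises `ValueError` on an out-of-category assignment; newer releases may raise `TypeError` instead, so the sketch catches both.

```python
import pandas as pd

cat = pd.Categorical(["low", "high", "low"], categories=["low", "mid", "high"])

# Assigning an existing category works, even one no value uses yet.
cat[0] = "mid"

# Assigning a value outside the categories is rejected.
# Both exception types are caught because the type differs across releases.
try:
    cat[1] = "extreme"
    rejected = False
except (ValueError, TypeError):
    rejected = True

# Order follows the categories, not the alphabetical order of the values.
ordered = pd.Categorical(["low", "high", "mid"],
                         categories=["low", "mid", "high"], ordered=True)
sorted_values = list(ordered.sort_values())
```

To allow a new value, add it to the categories first, for example with `add_categories`.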

Methods

Notes

See the user guide (https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html) for more.

Parameters

values : list-like: The values of the categorical. If categories are given, values not in categories will be replaced with NaN.
categories : Index-like (unique), optional: The unique categories for this categorical. If not given, the categories are assumed to be the unique values of `values` (sorted, if possible, otherwise in the order in which they appear).
ordered : bool, default False: Whether or not this categorical is treated as an ordered categorical. If True, the resulting categorical will be ordered. An ordered categorical respects, when sorted, the order of its `categories` attribute (which in turn is the `categories` argument, if provided).
dtype : CategoricalDtype: An instance of CategoricalDtype to use for this categorical.
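
A minimal sketch of how these parameters interact. The sample values are illustrative assumptions.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# "d" is not among the given categories, so it becomes NaN (code -1).
cat = pd.Categorical(["a", "b", "d"], categories=["a", "b", "c"])
codes = cat.codes.tolist()

# Without explicit categories, the sorted unique values are used.
inferred = pd.Categorical(["b", "a", "b"])
inferred_categories = list(inferred.categories)

# A CategoricalDtype bundles categories and ordered into a single argument.
dtype = CategoricalDtype(categories=["c", "b", "a"], ordered=True)
from_dtype = pd.Categorical(["a", "c"], dtype=dtype)
```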

Raises

ValueError: If the categories do not validate.
TypeError: If an explicit ordered=True is given but no `categories` and the `values` are not sortable.
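
As an illustration of categories that do not validate, the sketch below passes duplicate and null categories. It covers only the `ValueError` case, and the sample values are assumptions.

```python
import pandas as pd

# Duplicate categories fail validation.
try:
    pd.Categorical(["a", "b"], categories=["a", "a"])
    duplicate_error = None
except ValueError as err:
    duplicate_error = err

# A null entry among the categories fails validation as well.
try:
    pd.Categorical(["a"], categories=["a", None])
    null_error = None
except ValueError as err:
    null_error = err
```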

Represent a categorical variable in classic R / S-plus fashion.

See Also

CategoricalDtype: Type for categorical data.

CategoricalIndex: An Index with an underlying Categorical.

Examples

>>> pd.Categorical([1, 2, 3, 1, 2, 3])
[1, 2, 3, 1, 2, 3]
Categories (3, int64): [1, 2, 3]

>>> pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])
['a', 'b', 'c', 'a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']

Missing values are not included as a category.

>>> c = pd.Categorical([1, 2, 3, 1, 2, 3, np.nan])
>>> c
[1, 2, 3, 1, 2, 3, NaN]
Categories (3, int64): [1, 2, 3]

However, their presence is indicated in the `codes` attribute by code `-1`.

>>> c.codes
array([ 0,  1,  2,  0,  1,  2, -1], dtype=int8)
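
Because missing values carry code `-1`, the codes alone are enough to locate them, and together with the categories they can rebuild the categorical. A minimal sketch, assuming the same values as above:

```python
import numpy as np
import pandas as pd

c = pd.Categorical([1, 2, 3, 1, 2, 3, np.nan])

# Positions with code -1 are exactly the missing values.
missing_mask = c.codes == -1

# from_codes accepts -1 as the marker for a missing value.
rebuilt = pd.Categorical.from_codes(c.codes, categories=c.categories)
```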

Ordered Categoricals can be sorted according to the custom order of the categories and can have a min and max value.

>>> c = pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'], ordered=True,
...                    categories=['c', 'b', 'a'])
>>> c
['a', 'b', 'c', 'a', 'b', 'c']
Categories (3, object): ['c' < 'b' < 'a']

>>> c.min()
'c'
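
The same ordered categorical also supports `max`, sorting, and comparison against a category. This is a short sketch that reuses the values from the example above.

```python
import pandas as pd

c = pd.Categorical(["a", "b", "c", "a", "b", "c"], ordered=True,
                   categories=["c", "b", "a"])

# 'a' is the last category, so it is the maximum.
largest = c.max()

# Sorting follows the category order 'c' < 'b' < 'a'.
sorted_values = list(c.sort_values())

# Comparisons against a category use the same order.
after_b = (c > "b").tolist()
```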

Back References

The following pages either refer to this document explicitly or contain code examples that use it.

pandas.core.arrays.categorical.Categorical.mode
pandas.core.dtypes.concat.union_categoricals
pandas.core.arrays.categorical.Categorical
pandas.core.dtypes.dtypes.CategoricalDtype
pandas.core.indexes.category.CategoricalIndex
pandas.core.arrays.categorical.Categorical.min
pandas.core.frame.DataFrame.memory_usage
pandas.core.reshape.tile.cut
pandas.core.arrays.categorical.Categorical.max

File: /pandas/core/arrays/categorical.py#252
type: <class 'abc.ABCMeta'>
Commit: