union_categoricals(to_union, sort_categories: bool = False, ignore_order: bool = False)
All categories must have the same dtype.
To learn more about categories, see the user guide:
https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html#unioning
to_union : list-like of Categorical, CategoricalIndex, or Series with dtype='category'.
sort_categories : bool, default False. If true, resulting categories will be lexsorted, otherwise they will be ordered as they appear in the data.
ignore_order : bool, default False. If true, the ordered attribute of the Categoricals will be ignored. Results in an unordered categorical.
TypeError: the inputs do not all have the same dtype
TypeError: the inputs do not all have the same ordered property
TypeError: all inputs are ordered and their categories are not identical
TypeError: sort_categories=True and the Categoricals are ordered
ValueError: an empty list of categoricals is passed
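The error conditions listed above can be reproduced with a short sketch. The helper name `raised` is hypothetical and exists only for this illustration; error messages may differ between pandas versions, so only the exception types are recorded.

```python
import pandas as pd
from pandas.api.types import union_categoricals


def raised(*args, **kwargs):
    """Hypothetical helper: return the exception type raised by the call, or None."""
    try:
        union_categoricals(*args, **kwargs)
    except (TypeError, ValueError) as exc:
        return type(exc)
    return None


ints = pd.Categorical([1, 2])
strs = pd.Categorical(["a", "b"])
ordered_ab = pd.Categorical(["a", "b"], ordered=True)
unordered_ab = pd.Categorical(["a", "b"])

# Categories of different dtypes (integers vs. strings)
mixed_dtype = raised([ints, strs])
# Same categories, but the ordered property differs
mixed_ordered = raised([ordered_ab, unordered_ab])
# Ordered inputs cannot be combined with sort_categories=True
sorted_ordered = raised([ordered_ab, ordered_ab], sort_categories=True)
# Nothing to combine
empty = raised([])
```

The first three calls are expected to raise TypeError and the last one ValueError, matching the list above.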
Combine list-like of Categorical-like, unioning categories.
>>> import pandas as pd
>>> from pandas.api.types import union_categoricals
If you want to combine categoricals that do not necessarily have the same categories, union_categoricals
will combine a list-like of categoricals. The new categories will be the union of the categories being combined.
>>> a = pd.Categorical(["b", "c"])
>>> b = pd.Categorical(["a", "b"])
>>> union_categoricals([a, b])
['b', 'c', 'a', 'b']
Categories (3, object): ['b', 'c', 'a']
By default, the resulting categories will be ordered as they appear in the categories of the data. If you want the categories to be lexsorted, use the sort_categories=True argument.
>>> union_categoricals([a, b], sort_categories=True)
['b', 'c', 'a', 'b']
Categories (3, object): ['a', 'b', 'c']
union_categoricals also works when combining two categoricals that have the same categories and order information (for example, cases where you could also use append).
>>> a = pd.Categorical(["a", "b"], ordered=True)
>>> b = pd.Categorical(["a", "b", "a"], ordered=True)
>>> union_categoricals([a, b])
['a', 'b', 'a', 'b', 'a']
Categories (2, object): ['a' < 'b']
The following raises TypeError because the categories are ordered and not identical.
>>> a = pd.Categorical(["a", "b"], ordered=True)
>>> b = pd.Categorical(["a", "b", "c"], ordered=True)
>>> union_categoricals([a, b])
Traceback (most recent call last):
    ...
TypeError: to union ordered Categoricals, all categories must be the same
New in version 0.20.0
Ordered categoricals with different categories or orderings can be combined by using the ignore_order=True argument.
>>> a = pd.Categorical(["a", "b", "c"], ordered=True)
>>> b = pd.Categorical(["c", "b", "a"], ordered=True)
>>> union_categoricals([a, b], ignore_order=True)
['a', 'b', 'c', 'c', 'b', 'a']
Categories (3, object): ['a', 'b', 'c']
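As a minimal sketch of what ignore_order changes, the snippet below uses two ordered categoricals whose category orderings are explicitly different (the explicit `categories=` arguments are an assumption added for this illustration) and inspects the `ordered` attribute of the result.

```python
import pandas as pd
from pandas.api.types import union_categoricals

# Two ordered categoricals with the same labels but opposite orderings
a = pd.Categorical(["a", "b", "c"], categories=["a", "b", "c"], ordered=True)
b = pd.Categorical(["c", "b", "a"], categories=["c", "b", "a"], ordered=True)

# Without ignore_order the orderings conflict and the union is rejected
try:
    union_categoricals([a, b])
    strict_error = None
except TypeError as exc:
    strict_error = exc

# With ignore_order=True the ordering is dropped and the union succeeds
merged = union_categoricals([a, b], ignore_order=True)
merged_is_ordered = merged.ordered
```

Here `strict_error` should hold a TypeError, and `merged_is_ordered` should be False, since the result is an unordered categorical.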
union_categoricals also works with a CategoricalIndex, or Series containing categorical data, but note that the resulting array will always be a plain Categorical.
>>> a = pd.Series(["b", "c"], dtype='category')
>>> b = pd.Series(["a", "b"], dtype='category')
>>> union_categoricals([a, b])
['b', 'c', 'a', 'b']
Categories (3, object): ['b', 'c', 'a']
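Because the result is always a plain Categorical, one possible follow-up is to wrap it again when a Series or CategoricalIndex is needed. This is only a sketch of that idea; note that the wrapped Series gets a fresh default index rather than the indexes of the inputs.

```python
import pandas as pd
from pandas.api.types import union_categoricals

a = pd.Series(["b", "c"], dtype="category")
b = pd.Series(["a", "b"], dtype="category")

combined = union_categoricals([a, b])      # plain Categorical, not a Series
as_series = pd.Series(combined)            # categorical Series with a new default index
as_index = pd.CategoricalIndex(combined)   # or a CategoricalIndex with the same categories
```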