pandas 1.4.2

value_counts(self, subset: 'Sequence[Hashable] | None' = None, normalize: 'bool' = False, sort: 'bool' = True, ascending: 'bool' = False, dropna: 'bool' = True) -> 'DataFrame | Series'

versionadded

Notes

If the groupby as_index is True then the returned Series will have a MultiIndex with one level per input column.
If the groupby as_index is False then the returned DataFrame will have an additional column with the value_counts. The column is labelled 'count' or 'proportion', depending on the normalize parameter.
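
The two return shapes described above can be inspected programmatically. The following is a minimal sketch, assuming a pandas version that provides groupby `value_counts` (1.4 or later); the variable names are illustrative only and the data mirrors the frame used in the Examples section.

```python
import pandas as pd

df = pd.DataFrame({
    "gender": ["male", "male", "female", "male", "female", "male"],
    "education": ["low", "medium", "high", "low", "high", "low"],
    "country": ["US", "FR", "US", "FR", "FR", "FR"],
})

# as_index=True (the groupby default): a Series whose MultiIndex has
# one level per input column.
as_series = df.groupby("gender").value_counts()
index_levels = list(as_series.index.names)

# as_index=False: a DataFrame with the counts in an additional column.
as_frame = df.groupby("gender", as_index=False).value_counts()
count_columns = list(as_frame.columns)

# normalize=True changes the label of the additional column.
as_props = df.groupby("gender", as_index=False).value_counts(normalize=True)
prop_columns = list(as_props.columns)

print(index_levels)
print(count_columns)
print(prop_columns)
```

Checking `index.names` and `columns` like this is a convenient way to confirm which shape downstream code will receive.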

By default, rows that contain any NA values are omitted from the result.
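
As a rough illustration of the NA handling, the sketch below counts a small hypothetical frame (`people`, not part of the original examples) with one missing `education` value, first with the default and then with `dropna=False`. Only the totals are compared, since they do not depend on row ordering.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one of the four rows has a missing 'education' value.
people = pd.DataFrame({
    "gender": ["male", "male", "female", "female"],
    "education": ["low", np.nan, "high", "high"],
})

# Default (dropna=True): the row containing NaN is left out of the counts.
default_counts = people.groupby("gender").value_counts()

# dropna=False: the NaN row is counted as its own combination.
with_na_counts = people.groupby("gender").value_counts(dropna=False)

rows_counted_default = int(default_counts.sum())
rows_counted_with_na = int(with_na_counts.sum())
print(rows_counted_default, rows_counted_with_na)
```

Note that the missing value here sits in a counted column, not in the grouping key; NA handling for the key itself is controlled by the `dropna` argument of `groupby`.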

By default, the result will be in descending order so that the first element of each group is the most frequently-occurring row.

Parameters

subset : list-like, optional: Columns to use when counting unique combinations.
normalize : bool, default False: Return proportions rather than frequencies.
sort : bool, default True: Sort by frequencies.
ascending : bool, default False: Sort in ascending order.
dropna : bool, default True: Don’t include counts of rows that contain NA values.
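
The `subset` and `normalize` parameters can be combined. The sketch below is an illustration under the assumption that proportions are computed within each group, which is consistent with the `normalize=True` output in the Examples section; the variable names are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "gender": ["male", "male", "female", "male", "female", "male"],
    "education": ["low", "medium", "high", "low", "high", "low"],
    "country": ["US", "FR", "US", "FR", "FR", "FR"],
})

# Count only the 'education' column within each gender group,
# ignoring 'country'.
by_education = df.groupby("gender").value_counts(subset=["education"])
male_low_count = int(by_education.loc[("male", "low")])

# With normalize=True the values are proportions of each group's rows.
shares = df.groupby("gender").value_counts(subset=["education"], normalize=True)
male_low_share = float(shares.loc[("male", "low")])
male_total_share = float(shares.loc["male"].sum())

print(male_low_count, male_low_share, male_total_share)
```

Three of the four male rows have `education == 'low'`, so the count is 3 and the share is 0.75; the shares within the male group sum to 1.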

Returns

Series or DataFrame: Series if the groupby as_index is True, otherwise DataFrame.

Return a Series or DataFrame containing counts of unique rows.

See Also

DataFrame.value_counts: Equivalent method on DataFrame.

Series.value_counts: Equivalent method on Series.

SeriesGroupBy.value_counts: Equivalent method on SeriesGroupBy.

Examples

This example is valid syntax, but we were not able to check execution

>>> df = pd.DataFrame({
...    'gender': ['male', 'male', 'female', 'male', 'female', 'male'],
...    'education': ['low', 'medium', 'high', 'low', 'high', 'low'],
...    'country': ['US', 'FR', 'US', 'FR', 'FR', 'FR']
... })

>>> df
   gender education country
0    male       low      US
1    male    medium      FR
2  female      high      US
3    male       low      FR
4  female      high      FR
5    male       low      FR

>>> df.groupby('gender').value_counts()
gender  education  country
female  high       FR         1
                   US         1
male    low        FR         2
                   US         1
        medium     FR         1
dtype: int64

>>> df.groupby('gender').value_counts(ascending=True)
gender  education  country
female  high       FR         1
                   US         1
male    low        US         1
        medium     FR         1
        low        FR         2
dtype: int64

>>> df.groupby('gender').value_counts(normalize=True)
gender  education  country
female  high       FR         0.50
                   US         0.50
male    low        FR         0.50
                   US         0.25
        medium     FR         0.25
dtype: float64

>>> df.groupby('gender', as_index=False).value_counts()
   gender education country  count
0  female      high      FR      1
1  female      high      US      1
2    male       low      FR      2
3    male       low      US      1
4    male    medium      FR      1

>>> df.groupby('gender', as_index=False).value_counts(normalize=True)
   gender education country  proportion
0  female      high      FR        0.50
1  female      high      US        0.50
2    male       low      FR        0.50
3    male       low      US        0.25
4    male    medium      FR        0.25

File: /pandas/core/groupby/generic.py#1575
type: <class 'function'>
Commit: