pandas 1.4.2

get_groupby(obj: 'NDFrame', by: '_KeysArgType | None' = None, axis: 'int' = 0, level=None, grouper: 'ops.BaseGrouper | None' = None, exclusions=None, selection=None, as_index: 'bool' = True, sort: 'bool' = True, group_keys: 'bool' = True, squeeze: 'bool' = False, observed: 'bool' = False, mutated: 'bool' = False, dropna: 'bool' = True) -> 'GroupBy'

See aggregate, transform, and apply functions on this object.

The easiest way to create a GroupBy is obj.groupby(...), but you can also call:

grouped = groupby(obj, ...)

Notes

After grouping, see the aggregate, apply, and transform functions. Some brief usage notes follow. When grouping by multiple keys, the result index is a MultiIndex (hierarchical) by default.
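As a minimal sketch of the MultiIndex behaviour described above, the example below groups by two columns. The DataFrame and its column names (city, year, sales) are made up for illustration and are not part of the original docstring.

```python
import pandas as pd

# Hypothetical data, used only to illustrate grouping by two keys.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10, 20, 30, 40],
})

# Grouping by two keys: the aggregated result is indexed by a MultiIndex.
by_two_keys = df.groupby(["city", "year"])["sales"].sum()
is_multi = isinstance(by_two_keys.index, pd.MultiIndex)
index_names = list(by_two_keys.index.names)

# as_index=False keeps the keys as ordinary columns instead.
flat = df.groupby(["city", "year"], as_index=False)["sales"].sum()
flat_columns = list(flat.columns)
```

Passing as_index=False, which appears in the signature above, is one way to avoid the hierarchical index when a flat table is easier to work with.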

Iteration produces (key, group) tuples, i.e. chunking the data by group. So you can write code like:

grouped = obj.groupby(keys, axis=axis)
for key, group in grouped:
    # do something with the data
    ...
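A concrete version of the loop above might look like the following sketch. The team and score columns are illustrative assumptions; with a single string key, each key is a scalar and each group is a sub-DataFrame.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"team": ["a", "b", "a", "b"], "score": [1, 2, 3, 4]})

totals = {}
for key, group in df.groupby("team"):
    # `key` is the group label; `group` holds only the rows for that label.
    totals[key] = int(group["score"].sum())
```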

Function calls on GroupBy, if not specially implemented, "dispatch" to the grouped data. So if you group a DataFrame and wish to invoke the std() method on each group, you can simply do:

df.groupby(mapper).std()

rather than

df.groupby(mapper).aggregate(np.std)

You can pass arguments to these "wrapped" functions, too.
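For instance, keyword arguments such as ddof can be forwarded to std(). This is a small sketch on made-up data; the column names are assumptions.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"team": ["a", "a", "b", "b"], "score": [1.0, 3.0, 2.0, 6.0]})
grouped = df.groupby("team")["score"]

# Default: sample standard deviation (ddof=1) per group.
sample_std = grouped.std()

# The same method with an argument: population standard deviation per group.
population_std = grouped.std(ddof=0)
```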

See the online documentation for a full exposition of these topics and much more.

Parameters

obj : pandas object

axis : int, default 0

level : int, default None

Level of MultiIndex

groupings : list of Grouping objects

Most users should ignore this

exclusions : array-like, optional

List of columns to exclude

name : str

Most users should ignore this

Returns

Attributes
groups : dict

{group name -> group labels}

len(grouped) : int

Number of groups
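The two attributes above can be inspected as in this sketch, again on made-up data. The conversion to plain lists is only there to make the mapping easy to read.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"team": ["a", "b", "a", "c"], "score": [1, 2, 3, 4]})
grouped = df.groupby("team")

# len(grouped) is the number of distinct groups.
n_groups = len(grouped)

# grouped.groups maps each group name to the index labels of its rows.
labels = {name: list(idx) for name, idx in grouped.groups.items()}
```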

Class for grouping and aggregating relational data.



File: /pandas/core/groupby/groupby.py#3816
type: <class 'function'>
Commit: