
pandas 1.4.2


nlargest(self, n: 'int', columns: 'IndexLabel', keep: 'str' = 'first') -> 'DataFrame'

Return the first n rows with the largest values in `columns`, in descending order. Columns that are not specified are returned as well, but are not used for ordering.

This method is equivalent to df.sort_values(columns, ascending=False).head(n), but more performant.
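The equivalence stated above can be checked with a minimal sketch. The frame below is hypothetical sample data (not from the pandas documentation) and deliberately contains no tied values in the ordering column, since tie handling may differ between the two approaches.

```python
import pandas as pd

# Hypothetical sample data; "score" has no duplicate values.
df = pd.DataFrame(
    {"score": [10, 40, 30, 20], "label": ["a", "b", "c", "d"]},
    index=["w", "x", "y", "z"],
)

# Select the two rows with the largest "score" directly.
top_two = df.nlargest(2, "score")

# Same selection via a full descending sort followed by head().
sorted_top_two = df.sort_values("score", ascending=False).head(2)

# Both approaches should yield the same rows in the same order.
print(top_two.equals(sorted_top_two))
```

The full sort orders every row before discarding most of them, which is presumably why the direct selection is described as more performant.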

Notes

This function cannot be used with all column types. For example, when specifying columns with `object` or category dtypes, TypeError is raised.
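A minimal sketch of the restriction described above, using a hypothetical frame whose "name" column is explicitly given the `object` dtype. The fallback through sort_values is one possible workaround, not something this page prescribes.

```python
import pandas as pd

# Hypothetical sample data; "name" is forced to the object dtype.
df = pd.DataFrame(
    {
        "name": pd.Series(["a", "c", "b"], dtype=object),
        "value": [1, 3, 2],
    }
)

# Ordering by an object column is expected to raise TypeError.
try:
    df.nlargest(2, "name")
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

# Possible workaround: sort_values accepts object columns.
fallback = df.sort_values("name", ascending=False).head(2)
print(fallback)
```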

Parameters

n : int: Number of rows to return.
columns : label or list of labels: Column label(s) to order by.
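keep : {'first', 'last', 'all'}, default 'first': Where there are duplicate values:
    - first : prioritize the first occurrence(s)
    - last : prioritize the last occurrence(s)
    - all : do not drop any duplicates, even if it means selecting more than n items.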

Returns

DataFrame: The first n rows ordered by the given columns in descending order.

Return the first n rows ordered by `columns` in descending order.

See Also

DataFrame.head: Return the first `n` rows without re-ordering.

DataFrame.nsmallest: Return the first `n` rows ordered by `columns` in ascending order.

DataFrame.sort_values: Sort DataFrame by the values.

Examples


>>> import pandas as pd
>>> df = pd.DataFrame({'population': [59000000, 65000000, 434000,
...                                   434000, 434000, 337000, 11300,
...                                   11300, 11300],
...                    'GDP': [1937894, 2583560, 12011, 4520, 12128,
...                            17036, 182, 38, 311],
...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
...                                "IS", "NR", "TV", "AI"]},
...                   index=["Italy", "France", "Malta",
...                          "Maldives", "Brunei", "Iceland",
...                          "Nauru", "Tuvalu", "Anguilla"])
>>> df
          population      GDP alpha-2
Italy       59000000  1937894      IT
France      65000000  2583560      FR
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
Iceland       337000    17036      IS
Nauru          11300      182      NR
Tuvalu         11300       38      TV
Anguilla       11300      311      AI

In the following example, we will use nlargest to select the three rows having the largest values in column "population".


>>> df.nlargest(3, 'population')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Malta       434000    12011      MT

When using keep='last', ties are resolved in reverse order:


>>> df.nlargest(3, 'population', keep='last')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN

When using keep='all', all duplicate items are maintained:


>>> df.nlargest(3, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
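The three keep options can be compared side by side. The sketch below reuses a subset of the population data from the examples above (GDP and alpha-2 omitted for brevity) and checks which row labels each option returns; the variable names are illustrative only.

```python
import pandas as pd

# Subset of the example data: three countries tie at 434000.
df = pd.DataFrame(
    {"population": [59000000, 65000000, 434000, 434000, 434000, 337000]},
    index=["Italy", "France", "Malta", "Maldives", "Brunei", "Iceland"],
)

# keep='first' (the default) takes the earliest tied row.
first = df.nlargest(3, "population", keep="first")

# keep='last' takes the latest tied row instead.
last = df.nlargest(3, "population", keep="last")

# keep='all' keeps every tied row, so more than n rows can come back.
everything = df.nlargest(3, "population", keep="all")

print(list(first.index))
print(list(last.index))
print(len(everything))
```

Because keep='all' can return more than n rows, code that relies on a fixed result length may prefer 'first' or 'last'.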

To order by the largest values in column "population" and then "GDP", we can specify multiple columns, as in the next example.


>>> df.nlargest(3, ['population', 'GDP'])
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN


Back References

The following pages refer to this document, either explicitly or through code examples that use it.

pandas.core.frame.DataFrame.nsmallest


File: /pandas/core/frame.py#6585
type: <class 'function'>
Commit: