pandas 1.4.2

compare(self, other: 'DataFrame', align_axis: 'Axis' = 1, keep_shape: 'bool' = False, keep_equal: 'bool' = False) -> 'DataFrame'
versionadded

Notes

Matching NaNs will not appear as a difference.

Can only compare identically-labeled (i.e. same shape, identical row and column labels) DataFrames.
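The NaN rule above can be illustrated with a minimal sketch. The frames `left` and `right` are made up for illustration. The sketch contrasts an element-wise `!=` check, which flags NaN against NaN as a mismatch, with `compare`, which does not.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two identical frames with a NaN in the same position.
left = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})
right = left.copy()

# Element-wise inequality treats NaN != NaN as True, so row 1 is flagged.
naive_mismatch = (left != right).any(axis=1)

# compare() treats NaNs in the same position as equal, so nothing is reported.
diff = left.compare(right)

print(naive_mismatch.tolist())
print(diff.empty)
```

If the frames genuinely differ elsewhere, only those cells are reported. The matching NaNs still do not show up as differences.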

Parameters

other : DataFrame

Object to compare with.

align_axis : {0 or 'index', 1 or 'columns'}, default 1

Determine which axis to align the comparison on.

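  • 0, or 'index': differences are stacked vertically, with rows drawn alternately from self and other.

  • 1, or 'columns': differences are aligned horizontally, with columns drawn alternately from self and other.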

keep_shape : bool, default False

If true, all rows and columns are kept. Otherwise, only the ones with different values are kept.

keep_equal : bool, default False

If true, the result keeps values that are equal. Otherwise, equal values are shown as NaNs.

Raises

ValueError

When the two DataFrames don't have identical labels or shape.
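As a hedged illustration of the error condition, the sketch below compares two frames whose column labels differ and catches the resulting `ValueError`. The frames and the workaround of restricting both to their shared labels are assumptions for illustration, not part of the documented API. The exact error message is not shown because it may vary between versions.

```python
import pandas as pd

# Hypothetical data: same shape, but the second column is labeled differently.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"a": [1, 5], "c": [3, 4]})

try:
    left.compare(right)
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None

# One possible workaround (an assumption, not the library's recommendation):
# restrict both frames to the labels they share before comparing.
shared_cols = left.columns.intersection(right.columns)
shared_rows = left.index.intersection(right.index)
diff = left.loc[shared_rows, shared_cols].compare(
    right.loc[shared_rows, shared_cols]
)
print(diff)
```

Restricting to shared labels silently ignores rows or columns that exist in only one frame, so it is only appropriate when those are not of interest.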

Returns

DataFrame

DataFrame that shows the differences stacked side by side.

The resulting index will be a MultiIndex with 'self' and 'other' stacked alternately at the inner level.
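To make the MultiIndex layout concrete, the following sketch (with made-up frames) inspects the inner level for both values of `align_axis` and uses `xs` to pull out only the values from `other`. The data is hypothetical, and `xs` is just one of several ways to select a level.

```python
import pandas as pd

# Hypothetical data with a single differing cell at row 1, column "col3".
df = pd.DataFrame({"col1": ["a", "b"], "col3": [1.0, 3.0]})
df2 = df.copy()
df2.loc[1, "col3"] = 4.0

# Default align_axis=1: the columns carry the 'self'/'other' inner level.
diff_cols = df.compare(df2)
inner_col_labels = diff_cols.columns.get_level_values(1).tolist()
other_by_col = diff_cols.xs("other", axis=1, level=1)

# align_axis=0: the row index carries the 'self'/'other' inner level.
diff_rows = df.compare(df2, align_axis=0)
inner_row_labels = diff_rows.index.get_level_values(1).tolist()
other_by_row = diff_rows.xs("other", level=1)

print(other_by_col)
print(other_by_row)
```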

Compare to another DataFrame and show the differences.

See Also

DataFrame.equals

Test whether two objects contain the same elements.

Series.compare

Compare with another Series and show differences.
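The related methods can be contrasted in a short sketch on made-up data. `DataFrame.equals` answers only whether the two objects match, `DataFrame.compare` reports which cells differ, and `Series.compare` does the same for a single column.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one differing cell in column "y"; the NaN in "x" matches.
df = pd.DataFrame({"x": [1.0, np.nan], "y": ["a", "b"]})
df2 = df.copy()
df2.loc[0, "y"] = "c"

same = df.equals(df2)                 # a single True/False answer
diff = df.compare(df2)                # per-cell report for the whole frame
s_diff = df["y"].compare(df2["y"])    # per-element report for one column

print(same)
print(diff)
print(s_diff)
```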

Examples

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "col1": ["a", "a", "b", "b", "a"],
...         "col2": [1.0, 2.0, 3.0, np.nan, 5.0],
...         "col3": [1.0, 2.0, 3.0, 4.0, 5.0]
...     },
...     columns=["col1", "col2", "col3"],
... )
>>> df
  col1  col2  col3
0    a   1.0   1.0
1    a   2.0   2.0
2    b   3.0   3.0
3    b   NaN   4.0
4    a   5.0   5.0
>>> df2 = df.copy()
>>> df2.loc[0, 'col1'] = 'c'
>>> df2.loc[2, 'col3'] = 4.0
>>> df2
  col1  col2  col3
0    c   1.0   1.0
1    a   2.0   2.0
2    b   3.0   4.0
3    b   NaN   4.0
4    a   5.0   5.0

Align the differences on columns

>>> df.compare(df2)
  col1       col3
  self other self other
0    a     c  NaN   NaN
2  NaN   NaN  3.0   4.0

Stack the differences on rows

>>> df.compare(df2, align_axis=0)
        col1  col3
0 self     a   NaN
  other    c   NaN
2 self   NaN   3.0
  other  NaN   4.0

Keep the equal values

>>> df.compare(df2, keep_equal=True)
  col1       col3
  self other self other
0    a     c  1.0   1.0
2    b     b  3.0   4.0

Keep all original rows and columns

>>> df.compare(df2, keep_shape=True)
  col1       col2       col3
  self other self other self other
0    a     c  NaN   NaN  NaN   NaN
1  NaN   NaN  NaN   NaN  NaN   NaN
2  NaN   NaN  NaN   NaN  3.0   4.0
3  NaN   NaN  NaN   NaN  NaN   NaN
4  NaN   NaN  NaN   NaN  NaN   NaN

Keep all original rows and columns and also all original values

>>> df.compare(df2, keep_shape=True, keep_equal=True)
  col1       col2       col3
  self other self other self other
0    a     c  1.0   1.0  1.0   1.0
1    a     a  2.0   2.0  2.0   2.0
2    b     b  3.0   3.0  3.0   4.0
3    b     b  NaN   NaN  4.0   4.0
4    a     a  5.0   5.0  5.0   5.0

Back References

The following pages either refer to this document explicitly or contain code examples that use it.

pandas.core.series.Series.compare



File: /pandas/core/frame.py#7080
type: <class 'function'>
Commit: