pandas 1.4.2

ParametersReturnsBackRef
merge_asof(left: 'DataFrame | Series', right: 'DataFrame | Series', on: 'IndexLabel | None' = None, left_on: 'IndexLabel | None' = None, right_on: 'IndexLabel | None' = None, left_index: 'bool' = False, right_index: 'bool' = False, by=None, left_by=None, right_by=None, suffixes: 'Suffixes' = ('_x', '_y'), tolerance=None, allow_exact_matches: 'bool' = True, direction: 'str' = 'backward') -> 'DataFrame'

This is similar to a left-join except that we match on nearest key rather than equal keys. Both DataFrames must be sorted by the key.

For each row in the left DataFrame:

The default is "backward" and is compatible in versions below 0.20.0. The direction parameter was added in version 0.20.0 and introduces "forward" and "nearest".

Optionally match on equivalent keys with 'by' before searching with 'on'.

Parameters

left : DataFrame or named Series
right : DataFrame or named Series
on : label

Field name to join on. Must be found in both DataFrames. The data MUST be ordered. Furthermore this must be a numeric column, such as datetimelike, integer, or float. On or left_on/right_on must be given.

left_on : label

Field name to join on in left DataFrame.

right_on : label

Field name to join on in right DataFrame.

left_index : bool

Use the index of the left DataFrame as the join key.

right_index : bool

Use the index of the right DataFrame as the join key.

by : column name or list of column names

Match on these columns before performing merge operation.

left_by : column name

Field names to match on in the left DataFrame.

right_by : column name

Field names to match on in the right DataFrame.

suffixes : 2-length sequence (tuple, list, ...)

Suffix to apply to overlapping column names in the left and right side, respectively.

tolerance : int or Timedelta, optional, default None

Select asof tolerance within this range; must be compatible with the merge index.

allow_exact_matches : bool, default True
  • If True, allow matching with the same 'on' value (i.e. less-than-or-equal-to / greater-than-or-equal-to)

  • If False, don't match the same 'on' value (i.e., strictly less-than / strictly greater-than).

direction : 'backward' (default), 'forward', or 'nearest'

Whether to search for prior, subsequent, or closest matches.

Returns

merged : DataFrame

Perform a merge by key distance.

See Also

merge

Merge with a database-style join.

merge_ordered

Merge with optional filling/interpolation.

Examples

This example is valid syntax, but we were not able to check execution
>>> left = pd.DataFrame({"a": [1, 5, 10], "left_val": ["a", "b", "c"]})
... left a left_val 0 1 a 1 5 b 2 10 c
This example is valid syntax, but we were not able to check execution
>>> right = pd.DataFrame({"a": [1, 2, 3, 6, 7], "right_val": [1, 2, 3, 6, 7]})
... right a right_val 0 1 1 1 2 2 2 3 3 3 6 6 4 7 7
This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(left, right, on="a")
    a left_val  right_val
0   1        a          1
1   5        b          3
2  10        c          7
This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(left, right, on="a", allow_exact_matches=False)
    a left_val  right_val
0   1        a        NaN
1   5        b        3.0
2  10        c        7.0
This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(left, right, on="a", direction="forward")
    a left_val  right_val
0   1        a        1.0
1   5        b        6.0
2  10        c        NaN
This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(left, right, on="a", direction="nearest")
    a left_val  right_val
0   1        a          1
1   5        b          6
2  10        c          7

We can use indexed DataFrames as well.

This example is valid syntax, but we were not able to check execution
>>> left = pd.DataFrame({"left_val": ["a", "b", "c"]}, index=[1, 5, 10])
... left left_val 1 a 5 b 10 c
This example is valid syntax, but we were not able to check execution
>>> right = pd.DataFrame({"right_val": [1, 2, 3, 6, 7]}, index=[1, 2, 3, 6, 7])
... right right_val 1 1 2 2 3 3 6 6 7 7
This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(left, right, left_index=True, right_index=True)
   left_val  right_val
1         a          1
5         b          3
10        c          7

Here is a real-world times-series example

This example is valid syntax, but we were not able to check execution
>>> quotes = pd.DataFrame(
...  {
...  "time": [
...  pd.Timestamp("2016-05-25 13:30:00.023"),
...  pd.Timestamp("2016-05-25 13:30:00.023"),
...  pd.Timestamp("2016-05-25 13:30:00.030"),
...  pd.Timestamp("2016-05-25 13:30:00.041"),
...  pd.Timestamp("2016-05-25 13:30:00.048"),
...  pd.Timestamp("2016-05-25 13:30:00.049"),
...  pd.Timestamp("2016-05-25 13:30:00.072"),
...  pd.Timestamp("2016-05-25 13:30:00.075")
...  ],
...  "ticker": [
...  "GOOG",
...  "MSFT",
...  "MSFT",
...  "MSFT",
...  "GOOG",
...  "AAPL",
...  "GOOG",
...  "MSFT"
...  ],
...  "bid": [720.50, 51.95, 51.97, 51.99, 720.50, 97.99, 720.50, 52.01],
...  "ask": [720.93, 51.96, 51.98, 52.00, 720.93, 98.01, 720.88, 52.03]
...  }
... )
... quotes time ticker bid ask 0 2016-05-25 13:30:00.023 GOOG 720.50 720.93 1 2016-05-25 13:30:00.023 MSFT 51.95 51.96 2 2016-05-25 13:30:00.030 MSFT 51.97 51.98 3 2016-05-25 13:30:00.041 MSFT 51.99 52.00 4 2016-05-25 13:30:00.048 GOOG 720.50 720.93 5 2016-05-25 13:30:00.049 AAPL 97.99 98.01 6 2016-05-25 13:30:00.072 GOOG 720.50 720.88 7 2016-05-25 13:30:00.075 MSFT 52.01 52.03
This example is valid syntax, but we were not able to check execution
>>> trades = pd.DataFrame(
...  {
...  "time": [
...  pd.Timestamp("2016-05-25 13:30:00.023"),
...  pd.Timestamp("2016-05-25 13:30:00.038"),
...  pd.Timestamp("2016-05-25 13:30:00.048"),
...  pd.Timestamp("2016-05-25 13:30:00.048"),
...  pd.Timestamp("2016-05-25 13:30:00.048")
...  ],
...  "ticker": ["MSFT", "MSFT", "GOOG", "GOOG", "AAPL"],
...  "price": [51.95, 51.95, 720.77, 720.92, 98.0],
...  "quantity": [75, 155, 100, 100, 100]
...  }
...  )
... trades time ticker price quantity 0 2016-05-25 13:30:00.023 MSFT 51.95 75 1 2016-05-25 13:30:00.038 MSFT 51.95 155 2 2016-05-25 13:30:00.048 GOOG 720.77 100 3 2016-05-25 13:30:00.048 GOOG 720.92 100 4 2016-05-25 13:30:00.048 AAPL 98.00 100

By default we are taking the asof of the quotes

This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(trades, quotes, on="time", by="ticker")
                     time ticker   price  quantity     bid     ask
0 2016-05-25 13:30:00.023   MSFT   51.95        75   51.95   51.96
1 2016-05-25 13:30:00.038   MSFT   51.95       155   51.97   51.98
2 2016-05-25 13:30:00.048   GOOG  720.77       100  720.50  720.93
3 2016-05-25 13:30:00.048   GOOG  720.92       100  720.50  720.93
4 2016-05-25 13:30:00.048   AAPL   98.00       100     NaN     NaN

We only asof within 2ms between the quote time and the trade time

This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(
...  trades, quotes, on="time", by="ticker", tolerance=pd.Timedelta("2ms")
... ) time ticker price quantity bid ask 0 2016-05-25 13:30:00.023 MSFT 51.95 75 51.95 51.96 1 2016-05-25 13:30:00.038 MSFT 51.95 155 NaN NaN 2 2016-05-25 13:30:00.048 GOOG 720.77 100 720.50 720.93 3 2016-05-25 13:30:00.048 GOOG 720.92 100 720.50 720.93 4 2016-05-25 13:30:00.048 AAPL 98.00 100 NaN NaN

We only asof within 10ms between the quote time and the trade time and we exclude exact matches on time. However prior data will propagate forward

This example is valid syntax, but we were not able to check execution
>>> pd.merge_asof(
...  trades,
...  quotes,
...  on="time",
...  by="ticker",
...  tolerance=pd.Timedelta("10ms"),
...  allow_exact_matches=False
... ) time ticker price quantity bid ask 0 2016-05-25 13:30:00.023 MSFT 51.95 75 NaN NaN 1 2016-05-25 13:30:00.038 MSFT 51.95 155 51.97 51.98 2 2016-05-25 13:30:00.048 GOOG 720.77 100 NaN NaN 3 2016-05-25 13:30:00.048 GOOG 720.92 100 NaN NaN 4 2016-05-25 13:30:00.048 AAPL 98.00 100 NaN NaN
See :

Back References

The following pages refer to to this document either explicitly or contain code examples using this.

pandas.core.indexes.base.Index.asof pandas.core.reshape.merge.merge_ordered pandas.core.generic.NDFrame.asof pandas.core.reshape.merge.merge pandas.core.frame.DataFrame.merge

Local connectivity graph

Hover to see nodes names; edges to Self not shown, Caped at 50 nodes.

Using a canvas is more power efficient and can get hundred of nodes ; but does not allow hyperlinks; , arrows or text (beyond on hover)

SVG is more flexible but power hungry; and does not scale well to 50 + nodes.

All aboves nodes referred to, (or are referred from) current nodes; Edges from Self to other have been omitted (or all nodes would be connected to the central node "self" which is not useful). Nodes are colored by the library they belong to, and scaled with the number of references pointing them


File: /pandas/core/reshape/merge.py#326
type: <class 'function'>
Commit: