pandas 1.4.2

to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'int | None'

Write records stored in a DataFrame to a SQL database.

Databases supported by SQLAlchemy are supported. Tables can be newly created, appended to, or overwritten.

Notes

Timezone aware datetime columns will be written as Timestamp with timezone type with SQLAlchemy if supported by the database. Otherwise, the datetimes will be stored as timezone unaware timestamps local to the original timezone.
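As an illustration, here is a minimal sketch (not checked against a live database; the table name events is made up) that writes a timezone-aware column to an in-memory SQLite database, which has no timezone-aware type, so the values end up stored as local timestamps:

>>> import pandas as pd
>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite://', echo=False)
>>> df = pd.DataFrame({'ts': pd.date_range('2022-01-01', periods=2,
...                                        tz='US/Eastern')})
>>> df.to_sql('events', con=engine, index=False)
2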

Parameters

name : str

Name of SQL table.

con : sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection

Using SQLAlchemy makes it possible to use any DB supported by that library. Legacy support is provided for sqlite3.Connection objects. The user is responsible for engine disposal and connection closure for the SQLAlchemy connectable; see the SQLAlchemy documentation for details.

schema : str, optional

Specify the schema (if database flavor supports this). If None, use default schema.

if_exists : {'fail', 'replace', 'append'}, default 'fail'

How to behave if the table already exists.

  • fail: Raise a ValueError.

  • replace: Drop the table before inserting new values.

  • append: Insert new values to the existing table.

index : bool, default True

Write DataFrame index as a column. Uses index_label as the column name in the table.

index_label : str or sequence, default None

Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.

chunksize : int, optional

Specify the number of rows in each batch to be written at a time. By default, all rows will be written at once.

dtype : dict or scalar, optional

Specifying the datatype for columns. If a dictionary is used, the keys should be the column names and the values should be the SQLAlchemy types or strings for the sqlite3 legacy mode. If a scalar is provided, it will be applied to all columns.

method : {None, 'multi', callable}, optional

Controls the SQL insertion clause used:

  • None: Uses standard SQL INSERT clause (one per row).

  • 'multi': Pass multiple values in a single INSERT clause.

  • callable with signature (pd_table, conn, keys, data_iter); a sketch of such a callable is shown below.
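As a sketch of what a custom callable might look like (not taken verbatim from the pandas documentation; the name insert_rows is made up, and the pd_table.table attribute of pandas.io.sql.SQLTable is an implementation detail rather than documented API, so treat it as an assumption):

>>> def insert_rows(pd_table, conn, keys, data_iter):
...     # data_iter yields the rows of one chunk (all rows unless chunksize is set)
...     rows = [dict(zip(keys, row)) for row in data_iter]
...     # executemany-style INSERT through the SQLAlchemy connection
...     result = conn.execute(pd_table.table.insert(), rows)
...     # integer return values are summed into to_sql's own return value
...     return result.rowcount

The callable would then be passed as method=insert_rows in the to_sql call.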

Raises

ValueError

When the table already exists and if_exists is 'fail' (the default).

Returns

None or int

Number of rows affected by to_sql. None is returned if the callable passed into method does not return the number of rows.

The number of rows affected is the sum of the rowcount attribute of sqlite3.Cursor or the SQLAlchemy connectable, which may not reflect the exact number of written rows as stipulated by sqlite3 or SQLAlchemy.

New in version 1.4.0.

See Also

read_sql

Read a DataFrame from a table.

Examples

Create an in-memory SQLite database.

>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite://', echo=False)

Create a table from scratch with 3 rows.

>>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']})
>>> df
     name
0  User 1
1  User 2
2  User 3
>>> df.to_sql('users', con=engine)
3
>>> engine.execute("SELECT * FROM users").fetchall()
[(0, 'User 1'), (1, 'User 2'), (2, 'User 3')]

An sqlalchemy.engine.Connection can also be passed to con:

>>> with engine.begin() as connection:
...     df1 = pd.DataFrame({'name' : ['User 4', 'User 5']})
...     df1.to_sql('users', con=connection, if_exists='append')
2

This is allowed to support operations that require that the same DBAPI connection is used for the entire operation.
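For example, several writes can share one transaction so that they either all commit or all roll back together. A sketch (not executed here; users_audit is a hypothetical second table):

>>> with engine.begin() as connection:
...     rows_main = df1.to_sql('users', con=connection, if_exists='append')
...     rows_audit = df1.to_sql('users_audit', con=connection, if_exists='append')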

>>> df2 = pd.DataFrame({'name' : ['User 6', 'User 7']})
>>> df2.to_sql('users', con=engine, if_exists='append')
2
>>> engine.execute("SELECT * FROM users").fetchall()
[(0, 'User 1'), (1, 'User 2'), (2, 'User 3'),
 (0, 'User 4'), (1, 'User 5'), (0, 'User 6'),
 (1, 'User 7')]

Overwrite the table with just df2.

>>> df2.to_sql('users', con=engine, if_exists='replace',
...            index_label='id')
2
>>> engine.execute("SELECT * FROM users").fetchall()
[(0, 'User 6'), (1, 'User 7')]

Specify the dtype (especially useful for integers with missing values). Notice that while pandas is forced to store the data as floating point, the database supports nullable integers. When fetching the data with Python, we get back integer scalars.

>>> df = pd.DataFrame({"A": [1, None, 2]})
>>> df
     A
0  1.0
1  NaN
2  2.0
>>> from sqlalchemy.types import Integer
>>> df.to_sql('integers', con=engine, index=False,
...           dtype={"A": Integer()})
3
>>> engine.execute("SELECT * FROM integers").fetchall()
[(1,), (None,), (2,)]
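Reading the table back (a sketch, not checked here) shows the missing value coming back as NaN, so pandas returns the column as floating point:

>>> pd.read_sql('integers', con=engine)
     A
0  1.0
1  NaN
2  2.0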

File: /pandas/core/generic.py#2781
type: <class 'function'>