Document

read_parquet(path, engine: 'str' = 'auto', columns=None, storage_options: 'StorageOptions' = None, use_nullable_dtypes: 'bool' = False, **kwargs) -> 'DataFrame'

Parameters

path : str, path object or file-like object: String, path object (implementing os.PathLike[str] ), or file-like object implementing a binary read() function. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.parquet . A file URL can also be a path to a directory that contains multiple partitioned parquet files. Both pyarrow and fastparquet support paths to directories as well as file URLs. A directory path could be: file://localhost/path/to/tables or s3://bucket/partition_dir .
engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto': Parquet library to use. If 'auto', then the option io.parquet.engine is used. The default io.parquet.engine behavior is to try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable.
columns : list, default=None: If not None, only these columns will be read from the file.
storage_options : dict, optional: Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib as header options. For other URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are forwarded to fsspec . Please see fsspec and urllib for more details.

versionadded
use_nullable_dtypes : bool, default False: If True, use dtypes that use pd.NA as missing value indicator for the resulting DataFrame. (only applicable for the pyarrow engine) As new dtypes are added that support pd.NA in the future, the output with this option will change to use those dtypes. Note: this is an experimental option, and behaviour (e.g. additional support dtypes) may change without notice.

versionadded
**kwargs :: Any additional kwargs are passed to the engine.

Returns

DataFrame

Load a parquet object from the file path, returning a DataFrame.

Examples

See :

Back References

The following pages refer to to this document either explicitly or contain code examples using this.

pandas.io.pickle.read_pickle pandas.core.frame.DataFrame.to_parquet

Local connectivity graph

Hover to see nodes names; edges to Self not shown, Caped at 50 nodes.

Using a canvas is more power efficient and can get hundred of nodes ; but does not allow hyperlinks; , arrows or text (beyond on hover)

SVG is more flexible but power hungry; and does not scale well to 50 + nodes.

All aboves nodes referred to, (or are referred from) current nodes; Edges from Self to other have been omitted (or all nodes would be connected to the central node "self" which is not useful). Nodes are colored by the library they belong to, and scaled with the number of references pointing them

File: /pandas/io/parquet.py#437
type: <class 'function'>
Commit: