Document

to_stata(self, path: 'FilePath | WriteBuffer[bytes]', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None, *, value_labels: 'dict[Hashable, dict[float | int, str]] | None' = None) -> 'None'

Writes the DataFrame to a Stata dataset file. "dta" files contain a Stata dataset.

Parameters

path : str, path object, or buffer

String, path object (implementing os.PathLike[str] ), or file-like object implementing a binary write() function.

versionchanged

Previously this was "fname"

convert_dates : dict

Dictionary mapping columns containing datetime types to stata internal format to use when writing the dates. Options are 'tc', 'td', 'tm', 'tw', 'th', 'tq', 'ty'. Column can be either an integer or a name. Datetime columns that do not have a conversion type specified will be converted to 'tc'. Raises NotImplementedError if a datetime column has timezone information.

write_index : bool

Write the index to Stata dataset.

byteorder : str

Can be ">", "<", "little", or "big". default is :None:None:`sys.byteorder`.

time_stamp : datetime

A datetime to use as file creation date. Default is the current time.

data_label : str, optional

A label for the data set. Must be 80 characters or smaller.

variable_labels : dict

Dictionary containing columns as keys and variable labels as values. Each label must be 80 characters or smaller.

version : {114, 117, 118, 119, None}, default 114

Version to use in the output dta file. Set to None to let pandas decide between 118 or 119 formats depending on the number of columns in the frame. Version 114 can be read by Stata 10 and later. Version 117 can be read by Stata 13 or later. Version 118 is supported in Stata 14 and later. Version 119 is supported in Stata 15 and later. Version 114 limits string variables to 244 characters or fewer while versions 117 and later allow strings with lengths up to 2,000,000 characters. Versions 118 and 119 support Unicode characters, and version 119 supports more than 32,767 variables.

Version 119 should usually only be used when the number of variables exceeds the capacity of dta format 118. Exporting smaller datasets in format 119 may have unintended consequences, and, as of November 2020, Stata SE cannot read version 119 files.

versionchanged

Added support for formats 118 and 119.

convert_strl : list, optional

List of column names to convert to string columns to Stata StrL format. Only available if version is 117. Storing strings in the StrL format can produce smaller dta files if strings have more than 8 characters and values are repeated.

compression : str or dict, default 'infer'

For on-the-fly compression of the output data. If 'infer' and 'path' path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', or '.zst' (otherwise no compression). Set to None for no compression. Can also be a dict with key 'method' set to one of { 'zip' , 'gzip' , 'bz2' , 'zstd' } and other key-value pairs are forwarded to zipfile.ZipFile , gzip.GzipFile , bz2.BZ2File , or zstandard.ZstdDecompressor , respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1} .

versionadded

versionchanged

storage_options : dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib as header options. For other URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are forwarded to fsspec . Please see fsspec and urllib for more details.

versionadded

value_labels : dict of dicts

Dictionary containing columns as keys and dictionaries of column value to labels as values. Labels for a single variable must be 32,000 characters or smaller.

versionadded

Raises

NotImplementedError

If datetimes contain timezone information
Column dtype is not representable in Stata

ValueError

Columns listed in convert_dates are neither datetime64[ns] or datetime.datetime
Column listed in convert_dates is not in DataFrame
Categorical label contains more than 32,000 characters

Export DataFrame object to Stata dta format.

Examples

This example is valid syntax, but we were not able to check execution

>>> df = pd.DataFrame({'animal': ['falcon', 'parrot', 'falcon',
...                               'parrot'],
...                    'speed': [350, 18, 361, 15]})
... df.to_stata('animals.dta')  # doctest: +SKIP

See :

Back References

The following pages refer to to this document either explicitly or contain code examples using this.

pandas.io.stata.read_stata

Local connectivity graph

Hover to see nodes names; edges to Self not shown, Caped at 50 nodes.

Using a canvas is more power efficient and can get hundred of nodes ; but does not allow hyperlinks; , arrows or text (beyond on hover)

SVG is more flexible but power hungry; and does not scale well to 50 + nodes.

All aboves nodes referred to, (or are referred from) current nodes; Edges from Self to other have been omitted (or all nodes would be connected to the central node "self" which is not useful). Nodes are colored by the library they belong to, and scaled with the number of references pointing them

File: /pandas/core/frame.py#2490
type: <class 'function'>
Commit:

Parameters

Raises

See Also

Examples

Back References

Local connectivity graph