info(self, verbose: 'bool | None' = None, buf: 'WriteBuffer[str] | None' = None, max_cols: 'int | None' = None, memory_usage: 'bool | str | None' = None, show_counts: 'bool | None' = None, null_counts: 'bool | None' = None) -> 'None'
This method prints information about a DataFrame including the index dtype and columns, non-null values and memory usage.
DataFrame to print information about.
Whether to print the full summary. By default, the setting in pandas.options.display.max_info_columns is followed.
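For example (a sketch, not a checked doctest, reusing the small df built in the Examples below), the option can be lowered temporarily to see the default switch to the truncated form:

>>> with pd.option_context("display.max_info_columns", 2):
...     df.info()              # 3 columns > 2: truncated summary by default
...     df.info(verbose=True)  # full per-column summary regardless of the option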
Where to send the output. By default, the output is printed to sys.stdout. Pass a writable buffer if you need to further process the output.

max_cols : int, optional
When to switch from the verbose to the truncated output. If the DataFrame has more than max_cols columns, the truncated output is used. By default, the setting in pandas.options.display.max_info_columns is used.
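A brief, unchecked sketch using the same df as in the Examples below (output omitted):

>>> df.info(max_cols=2)   # the DataFrame has 3 > 2 columns, so the truncated output is used
>>> df.info(max_cols=10)  # 3 <= 10 columns, so the full per-column output is used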
Specifies whether total memory usage of the DataFrame elements (including the index) should be displayed. By default, this follows the pandas.options.display.memory_usage setting.
True always shows memory usage. False never shows memory usage. A value of 'deep' is equivalent to "True with deep introspection". Memory usage is shown in human-readable units (base-2 representation). Without deep introspection, a memory estimation is made based on column dtype and number of rows, assuming values consume the same amount of memory for corresponding dtypes. With deep memory introspection, a real memory usage calculation is performed at the cost of computational resources.
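As an unchecked sketch of the three modes (output omitted; the '+' suffix on the shallow estimate marks it as a lower bound when object columns are present):

>>> df.info(memory_usage=False)   # omit the memory usage line entirely
>>> df.info(memory_usage=True)    # shallow, dtype-based estimate
>>> df.info(memory_usage='deep')  # real usage, measuring the object values themselves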
Whether to show the non-null counts. By default, this is shown only if the DataFrame is smaller than pandas.options.display.max_info_rows and pandas.options.display.max_info_columns. A value of True always shows the counts, and False never shows the counts.
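A brief sketch (unchecked, output omitted):

>>> df.info(show_counts=False)  # per-column listing without the Non-Null Count column
>>> df.info(show_counts=True)   # counts are shown even for frames exceeding the limits above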
Deprecated; use show_counts instead.
This method prints a summary of a DataFrame and returns None.
Print a concise summary of a DataFrame.
DataFrame.describe
Generate descriptive statistics of DataFrame columns.
DataFrame.memory_usage
Memory usage of DataFrame columns.
>>> int_values = [1, 2, 3, 4, 5]
>>> text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
>>> float_values = [0.0, 0.25, 0.5, 0.75, 1.0]
>>> df = pd.DataFrame({"int_col": int_values, "text_col": text_values,
...                    "float_col": float_values})
>>> df
   int_col text_col  float_col
0        1    alpha       0.00
1        2     beta       0.25
2        3    gamma       0.50
3        4    delta       0.75
4        5  epsilon       1.00
Prints information of all columns:
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   int_col    5 non-null      int64
 1   text_col   5 non-null      object
 2   float_col  5 non-null      float64
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes
Prints a summary of the column count and dtypes but not per-column information:
>>> df.info(verbose=False)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Columns: 3 entries, int_col to float_col
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes
Pipe the output of DataFrame.info to a buffer instead of sys.stdout, get the buffer content and write it to a text file:
>>> import io
>>> buffer = io.StringIO()
>>> df.info(buf=buffer)
>>> s = buffer.getvalue()
>>> with open("df_info.txt", "w",
...           encoding="utf-8") as f:  # doctest: +SKIP
...     f.write(s)
260
The memory_usage parameter allows deep introspection mode, especially useful for big DataFrames and fine-tuning memory optimization:
>>> random_strings_array = np.random.choice(['a', 'b', 'c'], 10 ** 6)
>>> df = pd.DataFrame({
...     'column_1': np.random.choice(['a', 'b', 'c'], 10 ** 6),
...     'column_2': np.random.choice(['a', 'b', 'c'], 10 ** 6),
...     'column_3': np.random.choice(['a', 'b', 'c'], 10 ** 6)
... })
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
 #   Column    Non-Null Count    Dtype
---  ------    --------------    -----
 0   column_1  1000000 non-null  object
 1   column_2  1000000 non-null  object
 2   column_3  1000000 non-null  object
dtypes: object(3)
memory usage: 22.9+ MB
>>> df.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
 #   Column    Non-Null Count    Dtype
---  ------    --------------    -----
 0   column_1  1000000 non-null  object
 1   column_2  1000000 non-null  object
 2   column_3  1000000 non-null  object
dtypes: object(3)
memory usage: 165.9 MB
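To see where that deep total comes from column by column, DataFrame.memory_usage (listed under See Also) can be called directly; a sketch, output omitted:

>>> df.memory_usage(deep=True)  # per-column byte counts, index included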
The following pages refer to this document either explicitly or contain code examples using this.
pandas.core.frame.DataFrame.memory_usage