To remove in the future –– pandas
pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis and manipulation tool available in any language. It is already well on its way toward this goal.
Here are just a few of the things that pandas does well:
Easy handling of missing data in floating point as well as non-floating point data.
Size mutability: columns can be inserted into and deleted from DataFrame and higher-dimensional objects.
Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations.
Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data.
Make it easy to convert ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects.
Intelligent label-based slicing, fancy indexing, and subsetting of large data sets.
Intuitive merging and joining of data sets.
Flexible reshaping and pivoting of data sets.
Hierarchical labeling of axes (possible to have multiple labels per tick).
Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving/loading data from the ultrafast HDF5 format.
Time series-specific functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging.
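A few of the features listed above can be sketched in a short example. This is a minimal illustration, not part of the original page: the labels, column names, and values are made up, and it only touches label alignment, missing-data handling, group by, and frequency conversion.

```python
import pandas as pd

# Automatic alignment: the two Series share only the labels "b" and "c".
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])
total = s1 + s2  # labels found in only one operand become NaN

# Missing data: replace the NaN values produced by alignment with a default.
filled = total.fillna(0.0)

# Split-apply-combine: group rows by a key column and sum each group.
df = pd.DataFrame({"key": ["x", "y", "x", "y"], "value": [1, 2, 3, 4]})
sums = df.groupby("key")["value"].sum()

# Frequency conversion: downsample four daily observations to 2-day bins.
days = pd.date_range("2024-01-01", periods=4, freq="D")
daily = pd.Series([1, 2, 3, 4], index=days)
two_day = daily.resample("2D").sum()

print(filled)
print(sums)
print(two_day)
```

Adding s1 and s2 aligns on the union of their labels, so "a" and "d" come out as missing values before fillna replaces them.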
The following pages refer to this document, either explicitly or through code examples that use it.
pandas.core.dtypes.base.register_extension_dtype
pandas.core.nanops.nanall
pandas.core.nanops.nankurt
pandas.core.dtypes.concat.union_categoricals
pandas._libs.tslibs.offsets.BusinessMonthEnd
pandas.core.nanops.nansem
pandas.core.nanops.nanvar
pandas._testing.asserters.assert_frame_equal
pandas.core.algorithms.take
pandas.io.stata.StataWriterUTF8
pandas._libs.tslibs.offsets.BusinessMonthBegin
pandas.core.arrays.sparse.array.SparseArray
pandas.core.nanops.nanany
pandas._libs.tslibs.offsets.BQuarterBegin
pandas.core.generic.NDFrame.resample
pandas.core.dtypes.common.is_extension_array_dtype
pandas._testing.asserters.assert_series_equal
pandas._libs.tslibs.offsets.DateOffset
pandas.core.nanops.nanmean
pandas.core.nanops.nanargmax
pandas.core.series.Series.resample
pandas.core.nanops.nanargmin
pandas.core.generic.NDFrame.to_json
pandas._testing.asserters.assert_extension_array_equal
pandas.core.generic.NDFrame.astype
pandas.io.formats.format.get_series_repr_params
pandas._testing.asserters.assert_index_equal
pandas.core.frame.DataFrame.resample
pandas._libs.tslibs.offsets.BYearEnd
pandas.core.nanops.nanstd
pandas.core.dtypes.inference.is_number
pandas.core.nanops.nanprod
pandas.core.nanops.nanskew
pandas.io.stata.StataWriter117
pandas.io.formats.format.get_dataframe_repr_params
pandas._libs.tslibs.offsets.BYearBegin
pandas._libs.tslibs.offsets.BQuarterEnd
pandas.core.nanops.nansum
pandas.core.nanops.nanmedian