To subclass this class effectively, you must override the following methods:
parse_data
_parse_nodes
_parse_doc
_validate_names
_validate_path
See each method's respective documentation for details on their functionality.
Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, and file.
The XPath expression used to select the set of nodes to migrate to a `DataFrame`. `etree` supports only limited XPath.
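As a rough illustration of what "limited XPath" means in practice, the sketch below uses the standard library's `xml.etree.ElementTree` directly rather than this parser class. The sample document and variable names are made up for the example. Simple paths and `[tag='text']` predicates work, while XPath functions such as `contains()` are rejected.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
XML = """<data>
  <row><shape>square</shape><sides>4</sides></row>
  <row><shape>circle</shape><sides>0</sides></row>
</data>"""

root = ET.fromstring(XML)

# Supported: child paths, descendant searches, simple equality predicates.
rows = root.findall("./row")
shapes = [el.text for el in root.findall(".//shape")]
square_rows = root.findall("./row[shape='square']")

# Not supported: XPath functions. ElementTree raises SyntaxError here.
try:
    root.findall("./row[contains(shape, 'squ')]")
    unsupported_raised = False
except SyntaxError:
    unsupported_raised = True
```

An expression that needs functions or axes beyond this subset generally calls for a full XPath 1.0 engine such as `lxml`.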
The namespaces defined in the XML document (`xmlns:namespace='URI'`), given as a dict whose keys are the namespace prefixes and whose values are the URIs.
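The prefix-to-URI mapping described above has the same shape that `xml.etree.ElementTree` accepts in its `find`/`findall` calls. The following sketch assumes a made-up document and the placeholder URI `https://example.com`; it shows the dict form only and is not taken from this class's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaced document for illustration.
XML = """<doc:data xmlns:doc="https://example.com">
  <doc:row><doc:shape>square</doc:shape></doc:row>
  <doc:row><doc:shape>circle</doc:shape></doc:row>
</doc:data>"""

# Key is the namespace prefix, value is the URI.
namespaces = {"doc": "https://example.com"}

root = ET.fromstring(XML)

# The prefix in the XPath is resolved through the dict.
rows = root.findall("./doc:row", namespaces)
shapes = [row.find("doc:shape", namespaces).text for row in rows]
```

The prefix used in the expression only has to match a key in the dict; it does not have to be the same prefix the document itself uses, as long as the URI is identical.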
Parse only the child elements at the specified XPath.
Parse only the attributes at the specified XPath.
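To make the difference between the two options concrete, here is a minimal sketch that builds one record per selected node, first from child elements only and then from attributes only. It uses `xml.etree.ElementTree` on an invented document and is an approximation of the idea, not this class's actual parsing code.

```python
import xml.etree.ElementTree as ET

# Hypothetical document whose rows carry both attributes and child elements.
XML = """<data>
  <row id="1" kind="a"><shape>square</shape><sides>4</sides></row>
  <row id="2" kind="b"><shape>circle</shape><sides>0</sides></row>
</data>"""

rows = ET.fromstring(XML).findall("./row")

# Child elements only: each child tag becomes a column, its text the value.
elems_only = [{child.tag: child.text for child in row} for row in rows]

# Attributes only: each attribute name becomes a column.
attrs_only = [dict(row.attrib) for row in rows]
```

Each list of dicts could then be handed to a `DataFrame` constructor, with the dict keys serving as column names.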
Column names for the `DataFrame` of parsed XML data.
Encoding of the XML object or document.
URL, file, file-like object, or a raw string containing XSLT. `etree` does not support XSLT, but the parameter is retained for consistency.
For on-the-fly decompression of on-disk data. If 'infer' and 'path_or_buffer' is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', or '.zst' (otherwise no compression). If using 'zip', the ZIP file must contain only one data file to be read in. Set to None for no decompression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd'}; the other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, or zstandard.ZstdDecompressor, respectively. As an example, the following could be passed for Zstandard decompression using a custom compression dictionary: compression={'method': 'zstd', 'dict_data': my_compression_dict}.
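The inference and forwarding rules above can be sketched with the standard library alone. The helpers `infer_compression` and `open_compressed` below are hypothetical names written for this illustration, cover only the 'gzip' and 'bz2' methods, and are not the library's implementation.

```python
import bz2
import gzip
import os
import tempfile

# Hypothetical table mirroring the extensions listed above.
EXTENSIONS = {".gz": "gzip", ".bz2": "bz2", ".zip": "zip", ".xz": "xz", ".zst": "zstd"}


def infer_compression(path):
    """Return a method name based on the file extension, or None."""
    return EXTENSIONS.get(os.path.splitext(path)[1].lower())


def open_compressed(path, compression="infer"):
    """Open path for binary reading, decompressing when requested."""
    options = {}
    if isinstance(compression, dict):
        # Dict form: 'method' selects the codec, the rest is forwarded.
        options = dict(compression)
        method = options.pop("method")
    else:
        method = compression
    if method == "infer":
        method = infer_compression(path)
    if method is None:
        return open(path, "rb")
    if method == "gzip":
        return gzip.GzipFile(path, mode="rb", **options)
    if method == "bz2":
        return bz2.BZ2File(path, mode="rb", **options)
    raise ValueError(f"method not covered by this sketch: {method!r}")


# Round trip: write a gzip file, then read it back both ways.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.xml.gz")
    with gzip.open(path, "wb") as fh:
        fh.write(b"<data/>")
    with open_compressed(path) as fh:
        inferred_payload = fh.read()
    with open_compressed(path, {"method": "gzip"}) as fh:
        explicit_payload = fh.read()
```

The dict form is useful mainly when the codec needs extra arguments, as in the Zstandard dictionary example given above.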
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc.
Internal subclass to parse XML into DataFrames.