compressers

The compressers subpackage has two main parts:

  1. The compresser classes (all BaseCompresser subclasses)

  2. The compresser registry.

The compresser classes must provide the minimum signature defined in the common base class, BaseCompresser. They are in charge of opening, closing, and returning the byte streams to which the binary serialized representation of arbitrary Python objects is written (or from which it is read).
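The open / get stream / close lifecycle described above can be sketched with the standard library alone. The class names SketchBaseCompresser and SketchGzipCompresser and the roundtrip helper are hypothetical stand-ins written for illustration; they are not the library's own classes, and the real implementation may differ.

```python
import abc
import gzip
import io
import pickle
from typing import IO


class SketchBaseCompresser(abc.ABC):
    """Hypothetical stand-in for the minimum signature described above."""

    @abc.abstractmethod
    def __init__(self, path, mode: str, **kwargs):
        ...

    @abc.abstractmethod
    def get_stream(self) -> IO[bytes]:
        ...

    @abc.abstractmethod
    def close(self, **kwargs) -> None:
        ...


class SketchGzipCompresser(SketchBaseCompresser):
    def __init__(self, path, mode: str, **kwargs):
        # gzip.open accepts either a file name or an existing file object.
        self._stream = gzip.open(path, mode=mode, **kwargs)

    def get_stream(self) -> IO[bytes]:
        return self._stream

    def close(self, **kwargs) -> None:
        self._stream.close()


def roundtrip(obj):
    """Pickle obj into an in-memory gzip stream and read it back."""
    buffer = io.BytesIO()
    writer = SketchGzipCompresser(buffer, "wb")
    try:
        pickle.dump(obj, writer.get_stream())
    finally:
        writer.close()  # flushes the gzip trailer; the buffer stays open
    buffer.seek(0)
    reader = SketchGzipCompresser(buffer, "rb")
    try:
        return pickle.load(reader.get_stream())
    finally:
        reader.close()
```

The pickling code only ever touches the object returned by get_stream(), which is what lets one code path serve every compression method.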

registry.get_compresser(compression)

Get the compresser class registered with a given compression name.

registry.get_compresser_from_extension(extension)

Get the compresser class registered with a given file extension.

registry.get_compression_from_extension(...)

Get the compression name registered with a given file extension.

registry.register_compresser(compression, ...)

Register a compression method, along with its compresser class, extensions and modes.

registry.get_compression_write_mode(compression)

Get the compression's default mode for opening the file buffer for writing.

registry.get_compression_read_mode(compression)

Get the compression's default mode for opening the file buffer for reading.

registry.add_compression_alias(alias, ...)

Add an alias for an already registered compression.

registry.get_known_compressions()

Get a list of known compression protocols.

registry.validate_compression(compression[, ...])

Check if the supplied compression protocol is supported.

registry.get_default_compression_mapping()

Get a mapping from known compression protocols to the default filename extensions.

registry.list_registered_compressers()

Get the list of registered compresser classes.

base.BaseCompresser(path, mode, **kwargs)

Compresser abstract base class.

no_compression.NoCompresser(path, mode, **kwargs)

Compresser class that represents a simple uncompressed file object.

gzip.GzipCompresser(path, mode, **kwargs)

Compresser class that wraps the gzip compression package.

bz2.Bz2Compresser(path, mode, **kwargs)

Compresser class that wraps the bz2 compression package.

lzma.LzmaCompresser(path, mode, **kwargs)

Compresser class that wraps the lzma compression package.

zipfile.ZipfileCompresser(path, mode, *[, ...])

Compresser class that wraps the zipfile compression package.

lz4.Lz4Compresser(path, mode, **kwargs)

Compresser class that wraps the lz4 compression package.

Registry

compress_pickle.compressers.registry.add_compression_alias(alias: str, compression: Optional[str])

Add an alias for an already registered compression.

Parameters
  • alias (str) – The alias to register

  • compression (Optional[str]) – The compression name that must already be registered.

Raises

ValueError – If the supplied compression is not known or if the supplied alias is already contained in the registry.

compress_pickle.compressers.registry.get_compresser(compression: Optional[str]) -> Type[compress_pickle.compressers.base.BaseCompresser]

Get the compresser class registered with a given compression name.

Parameters

compression (Optional[str]) – The compression name.

Raises

ValueError – If the supplied compression has not been registered.

Returns

The compresser class associated to the compression name.

Return type

Type[BaseCompresser]

compress_pickle.compressers.registry.get_compresser_from_extension(extension: str) -> Type[compress_pickle.compressers.base.BaseCompresser]

Get the compresser class registered with a given file extension.

Parameters

extension (str) – The file extension, for example “.zip”. Note that leading dot characters are stripped from any supplied extension before the lookup. This means that “.zip” and “zip” are considered equivalent extensions.

Raises

ValueError – If the supplied extension has not been registered.

Returns

The compresser class associated to the extension.

Return type

Type[BaseCompresser]

compress_pickle.compressers.registry.get_compression_from_extension(extension: str) -> Optional[str]

Get the compression name registered with a given file extension.

Parameters

extension (str) – The file extension, for example “.zip”. Note that leading dot characters are stripped from any supplied extension before the lookup. This means that “.zip” and “zip” are considered equivalent extensions.

Raises

ValueError – If the supplied extension has not been registered.

Returns

The compression name associated to the extension.

Return type

Optional[str]
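The extension normalisation described above amounts to stripping leading dots before the dictionary lookup. A minimal sketch, assuming a hypothetical _EXTENSIONS table whose entries are invented for the example:

```python
from typing import Dict

# Hypothetical extension table; the names on the right are illustrative only.
_EXTENSIONS: Dict[str, str] = {"zip": "zipfile", "gz": "gzip"}


def compression_from_extension_sketch(extension: str) -> str:
    """Look up a compression name, treating ".zip" and "zip" alike."""
    key = extension.lstrip(".")  # removes every leading dot
    try:
        return _EXTENSIONS[key]
    except KeyError:
        raise ValueError(f"Extension {extension!r} is not registered") from None
```

With this table, compression_from_extension_sketch(".zip") and compression_from_extension_sketch("zip") return the same name, and an unknown extension raises ValueError.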

compress_pickle.compressers.registry.get_compression_read_mode(compression: Optional[str]) -> str

Get the compression’s default mode for opening the file buffer for reading.

Parameters

compression (Optional[str]) – The compression name.

Returns

compression_read_mode – The default read mode of the given compression.

Return type

str

Raises

ValueError – If the default read mode of the supplied compression is not known.

compress_pickle.compressers.registry.get_compression_write_mode(compression: Optional[str]) -> str

Get the compression’s default mode for opening the file buffer for writing.

Parameters

compression (Optional[str]) – The compression name.

Returns

compression_write_mode – The default write mode of the given compression.

Return type

str

Raises

ValueError – If the default write mode of the supplied compression is not known.

compress_pickle.compressers.registry.get_default_compression_mapping() -> Dict[Optional[str], str]

Get a mapping from known compression protocols to the default filename extensions.

Returns

compression_map – Dictionary that maps known compression protocol names to their default file extension.

Return type

Dict[Optional[str], str]

compress_pickle.compressers.registry.get_known_compressions() -> List[Optional[str]]

Get a list of known compression protocols.

Returns

compressions – List of known compression protocol names.

Return type

List[Optional[str]]

compress_pickle.compressers.registry.get_registered_extensions() -> Dict[str, Optional[str]]

Get a copy of the mapping between file extensions and registered compressers.

Returns

The mapping between file extensions and registered compressers.

Return type

Dict[str, Optional[str]]

compress_pickle.compressers.registry.list_registered_compressers() -> List[Type[compress_pickle.compressers.base.BaseCompresser]]

Get the list of registered compresser classes.

Returns

The list of registered compresser classes.

Return type

List[Type[BaseCompresser]]

compress_pickle.compressers.registry.register_compresser(compression: Optional[str], compresser: Type[compress_pickle.compressers.base.BaseCompresser], extensions: Sequence[str], default_write_mode: str = 'wb', default_read_mode: str = 'rb')

Register a compression method, along with its compresser class, extensions and modes.

Parameters
  • compression (Optional[str]) – The compression name that will be registered.

  • compresser (Type[BaseCompresser]) – The compresser class. This should be a BaseCompresser subclass.

  • extensions (Sequence[str]) – A sequence of file name extensions (e.g. [“.zip”]) that will be registered to the supplied compression. These extensions are used to infer the compression method from a file name. The first entry in the sequence is used as the compression’s default extension. For example, if extensions = ["bz2", "bz"], both "bz2" and "bz" are registered to the compression, but "bz2" is taken as the compression’s default extension. Note that leading dot characters are stripped from any supplied extension before it is registered. This means that “.zip” and “zip” are considered equivalent extensions.

  • default_write_mode (str) – The write mode with which to open the file object stream by default.

  • default_read_mode (str) – The read mode with which to open the file object stream by default.

Raises
  • ValueError – If the supplied compression is already contained in the registry or if any of the supplied extensions is already registered.

  • TypeError – If the supplied compresser is not a BaseCompresser subclass.
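The registration rules above (default extension taken from the first entry, leading dots stripped, ValueError on clashes, TypeError for a wrong class) can be sketched as follows. RegistrySketch, SketchBase and Bz2Sketch are hypothetical names invented for the example, and the sketch assumes a non-empty extensions sequence; it is not the library's registry code.

```python
from typing import Dict, Optional, Sequence, Tuple


class SketchBase:
    """Hypothetical stand-in for BaseCompresser."""


class Bz2Sketch(SketchBase):
    """Hypothetical compresser class used only to populate the registry."""


class RegistrySketch:
    def __init__(self) -> None:
        self.compressers: Dict[Optional[str], type] = {}
        self.extensions: Dict[str, Optional[str]] = {}
        self.default_extension: Dict[Optional[str], str] = {}
        self.modes: Dict[Optional[str], Tuple[str, str]] = {}

    def register(
        self,
        compression: Optional[str],
        compresser: type,
        extensions: Sequence[str],
        default_write_mode: str = "wb",
        default_read_mode: str = "rb",
    ) -> None:
        if not (isinstance(compresser, type) and issubclass(compresser, SketchBase)):
            raise TypeError("compresser must be a SketchBase subclass")
        if compression in self.compressers:
            raise ValueError(f"{compression!r} is already registered")
        # ".bz" and "bz" are treated as the same extension.
        cleaned = [ext.lstrip(".") for ext in extensions]
        clashes = [ext for ext in cleaned if ext in self.extensions]
        if clashes:
            raise ValueError(f"Extensions already registered: {clashes}")
        self.compressers[compression] = compresser
        for ext in cleaned:
            self.extensions[ext] = compression
        # The first entry becomes the default extension (assumes non-empty input).
        self.default_extension[compression] = cleaned[0]
        self.modes[compression] = (default_write_mode, default_read_mode)


registry = RegistrySketch()
registry.register("bz2", Bz2Sketch, ["bz2", ".bz"])
```

All validation happens before any dictionary is modified, so a failed registration leaves the sketch registry unchanged.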

compress_pickle.compressers.registry.validate_compression(compression: Optional[str], infer_is_valid: bool = True)

Check if the supplied compression protocol is supported.

If it is not supported, a ValueError is raised.

Parameters
  • compression (Optional[str]) – A compression protocol. To see the known compression protocols, use get_known_compressions().

  • infer_is_valid (bool) – If True, compression="infer" is considered a valid compression protocol. If False, it is not accepted as a valid compression protocol.

Raises

ValueError – If the supplied compression is not supported.
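The effect of infer_is_valid can be sketched in a few lines. The function below is a hypothetical illustration that takes the set of known names as an explicit argument (the known parameter is an assumption of this sketch, not part of the documented signature).

```python
from typing import Iterable, Optional


def validate_compression_sketch(
    compression: Optional[str],
    known: Iterable[Optional[str]],
    infer_is_valid: bool = True,
) -> None:
    """Raise ValueError unless compression is known (or is an allowed "infer")."""
    accepted = set(known)
    if infer_is_valid:
        accepted.add("infer")
    if compression not in accepted:
        raise ValueError(f"Unsupported compression {compression!r}")
```

For instance, with known = [None, "gzip"], the value "infer" passes by default but raises ValueError once infer_is_valid=False is supplied.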

Compressers

class compress_pickle.compressers.base.BaseCompresser(path: Union[str, bytes, os.PathLike, IO[bytes]], mode: str, **kwargs)

Compresser abstract base class.

This class is in charge of handling the binary stream to which the pickled Python objects are written (or from which they are read). During an instance’s initialization, the binary stream must be opened based on the supplied path and mode. This stream is nothing more than a Python “file-like” object that does the actual writing and reading of the binary contents.

Parameters
  • path (Union[PathType, IO[bytes]]) – A PathType object (str, bytes, os.PathLike) or a file-like object (e.g. io.IOBase instances). The path that will be used to open the input/output binary stream.

  • mode (str) – Mode with which to open the file buffer.

  • kwargs – Any other keyword arguments that can be handled by the specific binary stream opener.

abstract close(**kwargs)

Close the input/output binary stream.

This closes the input/output stream that is created during __init__.

Parameters

kwargs – Any other keyword arguments that can be handled by the specific binary stream closer.

abstract get_stream() -> IO[bytes]

Get the input/output binary stream (i.e. a file-like object).

Returns

The input/output binary stream where the pickled objects are written to or read from.

Return type

IO[bytes]

class compress_pickle.compressers.no_compression.NoCompresser(path: Union[str, bytes, os.PathLike, IO[bytes]], mode: str, **kwargs)

Compresser class that represents a simple uncompressed file object.

This class either calls open() on the supplied path or, when the path is already a file-like object, uses it directly as the binary input/output stream.

Parameters
  • path (Union[PathType, IO[bytes]]) – A PathType object (str, bytes, os.PathLike) or a file-like object (e.g. io.IOBase instances). The path that will be used to open the input/output binary stream.

  • mode (str) – Mode with which to open the file buffer.

  • kwargs – Any other keyword arguments that are passed to open().

close()

Close the input/output stream if necessary.

This will only close the stream if the path argument passed during construction was a PathType (str, bytes, os.PathLike). Otherwise, this method does nothing.

get_stream() -> IO[bytes]

Get the input/output binary stream (i.e. a file-like object).

Returns

The input/output binary stream where the pickled objects are written to or read from.

Return type

IO[bytes]
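The ownership rule described for close() (close only what the class itself opened) can be sketched as below. NoCompresserSketch is a hypothetical re-implementation for illustration, not the library's class.

```python
import io
import os


class NoCompresserSketch:
    """Hypothetical illustration of the open-or-reuse behaviour."""

    def __init__(self, path, mode: str, **kwargs):
        if isinstance(path, (str, bytes, os.PathLike)):
            # A path was supplied: open it here and remember to close it later.
            self._stream = open(path, mode=mode, **kwargs)
            self._must_close = True
        else:
            # A file-like object was supplied: reuse it and leave it to the caller.
            self._stream = path
            self._must_close = False

    def get_stream(self):
        return self._stream

    def close(self) -> None:
        if self._must_close:
            self._stream.close()


buffer = io.BytesIO()
compresser = NoCompresserSketch(buffer, "wb")
compresser.get_stream().write(b"payload")
compresser.close()  # the caller-owned buffer stays open and readable
```

Leaving caller-supplied streams open lets the caller keep using the buffer, for example to read back what was just written.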

class compress_pickle.compressers.gzip.GzipCompresser(path: Union[str, bytes, os.PathLike, IO[bytes]], mode: str, **kwargs)

Compresser class that wraps the gzip compression package.

This class relies on the gzip module to open the input/output binary stream to which the pickled Python objects are written (or from which they are read). During an instance’s initialization, the binary stream is opened using gzip.open(path, mode=mode, **kwargs).

Parameters
  • path (Union[PathType, IO[bytes]]) – A PathType object (str, bytes, os.PathLike) or a file-like object (e.g. io.IOBase instances). The path that will be used to open the input/output binary stream.

  • mode (str) – Mode with which to open the file buffer.

  • kwargs – Any other keyword arguments that are passed to gzip.open().

close()

Close the input/output binary stream.

This closes the input/output stream that is created during __init__.

Parameters

kwargs – Any other keyword arguments that can be handled by the specific binary stream closer.

get_stream() -> IO[bytes]

Get the input/output binary stream (i.e. a file-like object).

Returns

The input/output binary stream where the pickled objects are written to or read from.

Return type

IO[bytes]

class compress_pickle.compressers.bz2.Bz2Compresser(path: Union[str, bytes, os.PathLike, IO[bytes]], mode: str, **kwargs)

Compresser class that wraps the bz2 compression package.

This class relies on the bz2 module to open the input/output binary stream to which the pickled Python objects are written (or from which they are read). During an instance’s initialization, the binary stream is opened using bz2.open(path, mode=mode, **kwargs).

Parameters
  • path (Union[PathType, IO[bytes]]) – A PathType object (str, bytes, os.PathLike) or a file-like object (e.g. io.IOBase instances). The path that will be used to open the input/output binary stream.

  • mode (str) – Mode with which to open the file buffer.

  • kwargs – Any other keyword arguments that are passed to bz2.open().

close()

Close the input/output binary stream.

This closes the input/output stream that is created during __init__.

Parameters

kwargs – Any other keyword arguments that can be handled by the specific binary stream closer.

get_stream() -> IO[bytes]

Get the input/output binary stream (i.e. a file-like object).

Returns

The input/output binary stream where the pickled objects are written to or read from.

Return type

IO[bytes]

class compress_pickle.compressers.lzma.LzmaCompresser(path: Union[str, bytes, os.PathLike, IO[bytes]], mode: str, **kwargs)

Compresser class that wraps the lzma compression package.

This class relies on the lzma module to open the input/output binary stream to which the pickled Python objects are written (or from which they are read). During an instance’s initialization, the binary stream is opened using lzma.open(path, mode=mode, **kwargs).

Parameters
  • path (Union[PathType, IO[bytes]]) – A PathType object (str, bytes, os.PathLike) or a file-like object (e.g. io.IOBase instances). The path that will be used to open the input/output binary stream.

  • mode (str) – Mode with which to open the file buffer.

  • kwargs – Any other keyword arguments that are passed to lzma.open().

close()

Close the input/output binary stream.

This closes the input/output stream that is created during __init__.

Parameters

kwargs – Any other keyword arguments that can be handled by the specific binary stream closer.

get_stream() -> IO[bytes]

Get the input/output binary stream (i.e. a file-like object).

Returns

The input/output binary stream where the pickled objects are written to or read from.

Return type

IO[bytes]
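To show how extra keyword arguments reach lzma.open(), here is a small standard-library sketch. The helper names dump_lzma_sketch and load_lzma_sketch are hypothetical; preset is a genuine lzma.open() keyword, used here only as an example of a forwarded option.

```python
import io
import lzma
import pickle


def dump_lzma_sketch(obj, fileobj, **kwargs) -> None:
    """Pickle obj into fileobj through an lzma stream, forwarding kwargs."""
    stream = lzma.open(fileobj, mode="wb", **kwargs)
    try:
        pickle.dump(obj, stream)
    finally:
        stream.close()  # finishes the LZMA stream; fileobj itself stays open


def load_lzma_sketch(fileobj, **kwargs):
    """Read back one pickled object from an lzma-compressed fileobj."""
    stream = lzma.open(fileobj, mode="rb", **kwargs)
    try:
        return pickle.load(stream)
    finally:
        stream.close()


buffer = io.BytesIO()
dump_lzma_sketch(list(range(1000)), buffer, preset=9)
buffer.seek(0)
restored = load_lzma_sketch(buffer)
```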

class compress_pickle.compressers.zipfile.ZipfileCompresser(path: Union[str, bytes, os.PathLike, IO[bytes]], mode: str, *, arcname=None, pwd=None, zipfile_compression=None, **kwargs)

Compresser class that wraps the zipfile compression package.

This class relies on the zipfile module to open the input/output binary stream to which the pickled Python objects are written (or from which they are read). During an instance’s initialization, a zipfile.ZipFile instance is created around the supplied path. The opened ZipFile is called the archive and works as a kind of directory that can hold other directories or files. These are called members of the archive. The ZipfileCompresser creates the input/output stream by opening a member file in the opened ZipFile archive. The name of the archive member can be chosen with the arcname argument.

Parameters
  • path (Union[PathType, IO[bytes]]) – A PathType object (str, bytes, os.PathLike) or a file-like object (e.g. io.IOBase instances). The path that will be used to open the zipfile.ZipFile object. The input/output binary stream is then opened from that object as zipfile.ZipFile(*).open(arcname, mode).

  • mode (str) – Mode with which to open the ZipFile (and also the archive member that is used as the input/output binary stream).

  • arcname (Optional[str]) – The name of the archive member of the opened zipfile.ZipFile that will be used as the binary input/output stream. If None, the arcname is assumed to be the basename of path (when path is path-like), path.name (when path is file-like and it has a name attribute) or “default” otherwise.

  • pwd (Optional[str]) – The password used to decrypt encrypted ZIP files.

  • zipfile_compression (Optional[str]) – If not None, it is passed as the compression keyword argument to zipfile.ZipFile(...).

  • kwargs – Any other keyword arguments that are passed to zipfile.ZipFile.

close()

Close the input/output binary stream and the ZipFile.

This closes the zipfile.ZipFile instance and archive member file-objects that are created during the __init__.

get_stream() -> IO[bytes]

Get the input/output binary stream (i.e. a file-like object).

Returns

The input/output binary stream where the pickled objects are written to or read from.

Return type

IO[bytes]
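The archive/member relationship described for this class can be sketched with the standard zipfile module. ZipfileCompresserSketch and default_arcname are hypothetical names written for this example; the arcname fallback follows the order given in the parameter description, and the zipfile_compression argument is left out for brevity.

```python
import io
import os
import pickle
import zipfile


def default_arcname(path) -> str:
    """Pick a member name: basename of a path, else path.name, else "default"."""
    if isinstance(path, (str, bytes, os.PathLike)):
        return os.path.basename(os.fsdecode(path))
    return getattr(path, "name", "default")


class ZipfileCompresserSketch:
    """Hypothetical illustration: open an archive, then one member inside it."""

    def __init__(self, path, mode: str, *, arcname=None, pwd=None, **kwargs):
        if arcname is None:
            arcname = default_arcname(path)
        self._archive = zipfile.ZipFile(path, mode=mode, **kwargs)
        # The member file is the actual input/output binary stream.
        self._stream = self._archive.open(arcname, mode=mode, pwd=pwd)

    def get_stream(self):
        return self._stream

    def close(self) -> None:
        # Close the member first, then the archive that contains it.
        self._stream.close()
        self._archive.close()


buffer = io.BytesIO()
writer = ZipfileCompresserSketch(buffer, "w")
pickle.dump({"answer": 42}, writer.get_stream())
writer.close()

buffer.seek(0)
reader = ZipfileCompresserSketch(buffer, "r")
restored = pickle.load(reader.get_stream())
reader.close()
```

Because an io.BytesIO object has no name attribute, the member written in this usage example is called "default".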

class compress_pickle.compressers.lz4.Lz4Compresser(path: Union[str, bytes, os.PathLike, IO[bytes]], mode: str, **kwargs)

Compresser class that wraps the lz4 compression package.

This class relies on the lz4 module to open the input/output binary stream to which the pickled Python objects are written (or from which they are read). During an instance’s initialization, the binary stream is opened using lz4.frame.open(path, mode=mode, **kwargs).

Parameters
  • path (Union[PathType, IO[bytes]]) – A PathType object (str, bytes, os.PathLike) or a file-like object (e.g. io.IOBase instances). The path that will be used to open the input/output binary stream.

  • mode (str) – Mode with which to open the file buffer.

  • kwargs – Any other keyword arguments that are passed to lz4.frame.open().

close()

Close the input/output binary stream.

This closes the input/output stream that is created during __init__.

Parameters

kwargs – Any other keyword arguments that can be handled by the specific binary stream closer.

get_stream() -> IO[bytes]

Get the input/output binary stream (i.e. a file-like object).

Returns

The input/output binary stream where the pickled objects are written to or read from.

Return type

IO[bytes]