corelay.io
A sub-package containing IO-related modules for storing intermediate results of operations performed by instances of
Processor. This can be used in a Pipeline to prevent the re-computation of
intermediate results needed multiple times, or as a cache for subsequent runs of the same pipeline.
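The caching idea described above can be sketched without corelay itself. The following minimal example assumes a hypothetical in-memory `DictStorage` class and a hypothetical helper `cached_call`; neither is part of the package, and `NoDataSource` is redefined locally as a stand-in. Only the `read`/`write` contract and the use of `NoDataSource` follow what is documented on this page.

```python
from typing import Any, Callable, Dict


class NoDataSource(Exception):
    """Local stand-in for corelay.io.NoDataSource (assumption for this sketch)."""


class DictStorage:
    """Hypothetical in-memory store keyed by data_key; not part of corelay."""

    def __init__(self, data_key: str) -> None:
        self.data_key = data_key
        self._data: Dict[str, Any] = {}

    def read(self, data_in: Any = None, meta: Any = None) -> Any:
        # Mirror the documented contract: raise when nothing is stored.
        try:
            return self._data[self.data_key]
        except KeyError as error:
            raise NoDataSource(self.data_key) from error

    def write(self, data_out: Any, data_in: Any = None, meta: Any = None) -> None:
        self._data[self.data_key] = data_out


def cached_call(function: Callable[[Any], Any], data_in: Any, storage: DictStorage) -> Any:
    """Return the stored result if present, otherwise compute and store it."""
    try:
        return storage.read(data_in=data_in)
    except NoDataSource:
        data_out = function(data_in)
        storage.write(data_out, data_in=data_in)
        return data_out


calls = []


def expensive_square(value: int) -> int:
    calls.append(value)  # Record each real computation.
    return value * value


storage = DictStorage(data_key="square")
first = cached_call(expensive_square, 4, storage)   # Computed and written.
second = cached_call(expensive_square, 4, storage)  # Served from the store.
```

The second call never reaches `expensive_square`, which is the behavior a storage-backed pipeline step relies on to avoid re-computation.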
- exception corelay.io.NoDataSource[source]
Bases:
Exception
An exception that is raised when no data source is available.
- exception corelay.io.NoDataTarget[source]
Bases:
Exception
An exception that is raised when no data target is available.
- __init__() None[source]
Initializes a new NoDataTarget instance.
- Return type:
None
- class corelay.io.DataStorageBase[source]
The abstract base class for key-value stores.
- __bool__() bool[source]
Converts the data storage object to a bool value. This is used to determine whether the data storage object is actually backed by a store.
- __enter__() DataStorageBase[source]
Opens the IO object and returns the instance. This implements the context manager protocol, so that a with statement automatically closes the IO object and releases its resources when the block exits.
- Returns:
Returns this instance of the DataStorageBase class.
- Return type:
DataStorageBase
- __exit__(exception_type: type[Exception] | None, exception: Exception, traceback: TracebackType | None) None[source]
Closes the IO object. This implements the context manager protocol, so that a with statement automatically closes the IO object and releases its resources when the block exits.
- Parameters:
exception_type (type[Exception] | None) – When the context manager exits due to an exception, this is the type of the exception that was raised; otherwise it is None.
exception (Exception) – When the context manager exits due to an exception, this is the exception that was raised; otherwise it is None.
traceback (types.TracebackType | None) – When the context manager exits due to an exception, this is the traceback of the exception that was raised; otherwise it is None.
- Return type:
None
- __init__(**kwargs: Any) None[source]
Initializes a new DataStorageBase instance.
- __tracked__: OrderedDict[str, Any]
A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed in double underscores.
- at(**kwargs: Any) DataStorageBase[source]
Returns a copy of the instance in which the keyword arguments have been set as attributes.
- Parameters:
**kwargs (Any) – The keyword arguments, which are set as attributes of the copy.
- Raises:
TypeError – One or more of the names in the keyword arguments are not valid attribute names.
- Returns:
Returns a copy of the instance in which the keyword arguments have been set as attributes. This makes it possible to create an instance with new or updated attributes without modifying the original instance.
- Return type:
DataStorageBase
- abstractmethod keys() KeysView[str][source]
Retrieves the keys of the data stored in the storage container.
- Returns:
Returns a view of the keys of the IO file object.
- Return type:
KeysView[str]
- abstractmethod read(data_in: Any = None, meta: Any = None) Any[source]
Reads the output data that was produced by the specified input data, if it is available. The metadata can contain additional identifying information about the data.
- Parameters:
- Raises:
NoDataSource – The data source is not available.
- Returns:
Returns the data that was produced by the specified input data if it is available.
- Return type:
Any
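The context manager protocol and the copy semantics of `at()` described for DataStorageBase can be illustrated with a small stand-in. This is a minimal sketch, not the corelay implementation: `SketchStorage` and its `is_open` flag are hypothetical, and treating a "valid attribute name" as any Python identifier is an assumption made for illustration.

```python
import copy
from types import TracebackType
from typing import Any, Optional, Type


class SketchStorage:
    """Hypothetical stand-in for the documented protocol; not corelay code."""

    def __init__(self, **kwargs: Any) -> None:
        self.is_open = False
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __enter__(self) -> "SketchStorage":
        self.is_open = True  # A real storage would open its file here.
        return self

    def __exit__(
        self,
        exception_type: Optional[Type[BaseException]],
        exception: Optional[BaseException],
        traceback: Optional[TracebackType],
    ) -> None:
        self.is_open = False  # Runs even when the with-body raises.

    def at(self, **kwargs: Any) -> "SketchStorage":
        # Assumption: a valid attribute name is any Python identifier.
        for name in kwargs:
            if not name.isidentifier():
                raise TypeError(f"{name!r} is not a valid attribute name.")
        clone = copy.copy(self)  # The original instance stays untouched.
        for name, value in kwargs.items():
            setattr(clone, name, value)
        return clone


base = SketchStorage(data_key="embedding")
derived = base.at(data_key="clustering")

with derived as storage:
    opened_inside = storage.is_open
```

After the with block, `derived.is_open` is `False` again, while `base.data_key` still holds its original value.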
- class corelay.io.NoStorage[source]
Bases:
DataStorageBase
A placeholder data storage class, which does not actually use persistent storage and raises exceptions when trying to read from it or write to it.
- __bool__() bool[source]
Converts the data storage object to a bool value. This is used to determine whether the data storage object is actually backed by a store.
- __tracked__: OrderedDict[str, Any]
A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed in double underscores.
- exists() bool[source]
Returns True if data exists.
- Raises:
NoDataSource – This is a placeholder data storage class and does not actually use persistent storage and therefore always raises this exception.
- Returns:
Returns False since this is a placeholder data storage class and does not actually use persistent storage.
- Return type:
bool
- keys() KeysView[str][source]
Retrieves the keys of the data stored in the storage container.
- Raises:
NoDataSource – This is a placeholder data storage class and does not actually use persistent storage and therefore always raises this exception.
- Returns:
Never returns, since this is a placeholder data storage class that does not actually use persistent storage and always raises an exception.
- Return type:
KeysView[str]
- read(data_in: Any = None, meta: Any = None) Any[source]
Reads the output data that was produced by the specified input data, if it is available. The metadata can contain additional identifying information about the data.
- Parameters:
- Raises:
NoDataSource – This is a placeholder data storage class and does not actually use persistent storage and therefore always raises this exception.
- Returns:
Returns the data that was produced by the specified input data if it is available.
- Return type:
Any
- write(data_out: Any, data_in: Any = None, meta: Any = None) None[source]
Writes the specified output data to the storage. The metadata can be used to store additional identifying information about the data.
- Parameters:
- Raises:
NoDataTarget – This is a placeholder data storage class and does not actually use persistent storage and therefore always raises this exception.
- Return type:
None
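One way a caller might use such a placeholder is to check its truth value before attempting any IO. The sketch below assumes that the placeholder evaluates to `False`; `PlaceholderStorage` and `run_step` are hypothetical names, and both exceptions are redefined locally as stand-ins for the corelay classes.

```python
from typing import Any, Callable


class NoDataSource(Exception):
    """Local stand-in for corelay.io.NoDataSource."""


class NoDataTarget(Exception):
    """Local stand-in for corelay.io.NoDataTarget."""


class PlaceholderStorage:
    """Hypothetical stand-in that mirrors the documented NoStorage behavior."""

    def __bool__(self) -> bool:
        return False  # Assumption: a placeholder is not backed by a store.

    def read(self, data_in: Any = None, meta: Any = None) -> Any:
        raise NoDataSource("No data source available.")

    def write(self, data_out: Any, data_in: Any = None, meta: Any = None) -> None:
        raise NoDataTarget("No data target available.")


def run_step(function: Callable[[Any], Any], data_in: Any, storage: Any) -> Any:
    """Compute directly when the storage is falsy, otherwise use it as a cache."""
    if not storage:
        return function(data_in)
    try:
        return storage.read(data_in=data_in)
    except NoDataSource:
        data_out = function(data_in)
        storage.write(data_out, data_in=data_in)
        return data_out


result = run_step(len, "abc", PlaceholderStorage())
```

Because the truthiness check comes first, neither `read` nor `write` is called on the placeholder, so no exception handling is needed on the common path.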
- class corelay.io.PickleStorage[source]
Bases:
DataStorageBase
Experimental pickle storage that uses the pickle module to store data.
- __init__(path: str | Path, mode: str = 'r', data_key: str | None = None, **kwargs: Any) None[source]
Initializes a new PickleStorage instance.
- Parameters:
path (str | pathlib.Path) – The path to the pickle file that the data is read from or written to.
mode (str) – The mode in which the file is opened. This can be either “w” for write mode, “r” for read mode, or “a” for append mode. In write mode, the file is created if it does not exist and an existing file is overwritten. In read mode, the file must already exist and the data is read from the file. In append mode, the file is created if it does not exist and the data is appended to the end of the file. Defaults to “r”.
data_key (str | None) – The key of the data that is read from or written to the pickle file. Defaults to None.
**kwargs (Any) – Keyword arguments that are passed to the constructor of the parent class, i.e., DataStorageBase.
- Raises:
ValueError – The mode is not “w”, “r”, or “a”.
- Return type:
None
- __tracked__: OrderedDict[str, Any]
A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed in double underscores.
- data_key: Annotated[str, Param]
The key of the data that is read from the pickle file or written to the pickle file.
- keys() KeysView[str][source]
Retrieves the keys of the data stored in the pickle file.
- Returns:
Returns a view of keys of the data that is stored in the file.
- Return type:
KeysView[str]
- read(data_in: Any = None, meta: Any = None) Any[source]
Retrieves the data for a given data key.
- Parameters:
- Raises:
NoDataSource – The data source for the given data key does not exist.
- Returns:
Returns the data for the given data key.
- Return type:
Any
- write(data_out: Any, data_in: Any = None, meta: Any = None) None[source]
Writes the specified output data to the pickle file using the given data key as: {'data': data_out, 'key': self.data_key}.
- io: IO[Any]
The file object to read data from and write data to. This is a binary file object that is used to store the pickled data.
- data: dict[str, Any]
A dict that stores the data that is read from or written to the file. Its keys are the keys of the data stored in the file, and its values are the corresponding data. The dict also serves as a cache, so that data that has already been read does not need to be read from the file again.
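The record shape and the cache dict described above can be sketched with the standard pickle module alone. This is an illustration under assumptions, not the PickleStorage implementation: `append_record` and `load_records` are hypothetical helpers, and storing one pickled record after another in a single file is assumed from the description of append mode.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict


def append_record(path: Path, data_key: str, data_out: Any) -> None:
    """Append one pickled record in the documented {'data', 'key'} shape."""
    with open(path, "ab") as file:
        pickle.dump({"data": data_out, "key": data_key}, file)


def load_records(path: Path) -> Dict[str, Any]:
    """Read consecutive pickled records until the end of the file."""
    data: Dict[str, Any] = {}
    with open(path, "rb") as file:
        while True:
            try:
                record = pickle.load(file)
            except EOFError:
                break  # No more records in the file.
            data[record["key"]] = record["data"]
    return data


with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "cache.pkl"
    append_record(path, "labels", [0, 1, 1])
    append_record(path, "scores", {"accuracy": 0.5})
    loaded = load_records(path)
```

The resulting `loaded` dict plays the role of the cache: once the file has been scanned, lookups by key no longer touch the file. Note that unpickling data from untrusted sources is unsafe, which applies to any pickle-based store.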
- class corelay.io.HDF5Storage[source]
Bases:
DataStorageBase
A storage that uses HDF5 files to store data.
- __init__(path: str | Path, mode: str = 'r', data_key: str | None = None, **kwargs: Any) None[source]
Initializes a new HDF5Storage instance.
- Parameters:
path (str | pathlib.Path) – The path to the HDF5 file that the data is read from or written to.
mode (str) – The mode in which the HDF5 file is opened. This can be either “w” for write mode, “r” for read mode, or “a” for append mode. In write mode, the file is created if it does not exist and an existing file is overwritten. In read mode, the file must already exist and the data is read from the file. In append mode, the file is created if it does not exist; if it already exists, the data is appended to the end of the file. Defaults to “r”.
data_key (str | None) – The key of the data that is read from or written to the HDF5 file. Defaults to None.
**kwargs (Any) – Keyword arguments that are passed to the constructor of the parent class, i.e., DataStorageBase.
- Raises:
ValueError – The mode is not “w”, “r”, or “a”.
- Return type:
None
- __tracked__: OrderedDict[str, Any]
A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed in double underscores.
- data_key: Annotated[str, Param]
The key of the data that is read from the HDF5 file or written to the HDF5 file.
- keys() KeysView[str][source]
Retrieves the keys of the data stored in the HDF5 file.
- Returns:
Returns a view of keys of the data in the HDF5 file.
- Return type:
KeysView[str]
- read(data_in: Any = None, meta: Any = None) Any[source]
Retrieves the data for a given data key.
- Parameters:
- Raises:
NoDataSource – The data source for the given data key does not exist.
- Returns:
Returns the data for the given data key.
- Return type:
Any
- write(data_out: dict[str, Any] | tuple[Any, ...] | Any, data_in: Any = None, meta: Any = None) None[source]
Writes the specified output data to the HDF5 file. If the output data is a dict, it is stored in an HDF5 group with the name given by the data key; the keys of the dict are used as the names of the datasets in this group and the values of the dict as their data. If the output data is a tuple, it is likewise stored in an HDF5 group with the name given by the data key; the indices of the tuple are used as the names of the datasets and the values of the tuple as their data. If the output data is neither a dict nor a tuple, it is stored in an HDF5 dataset with the name given by the data key.
- Parameters:
data_out (dict[str, Any] | tuple[Any, ...] | Any) – The data to write to the HDF5 file. This can be a dict, a tuple, or any value that can be written to an HDF5 file (i.e., basic data types like int, float, bool, or str, or an ndarray). A dict or a tuple is stored as an HDF5 group and any other value as a single HDF5 dataset, as described above.
data_in (Any) – The input data that produced the output data. Defaults to None.
meta (Any) – The metadata that can be used to store additional identifying information about the data. Defaults to None.
- Return type:
None
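The three storage cases of `write` can be made concrete with a pure-Python sketch that computes only the resulting layout. It does not use h5py and writes no file; `hdf5_layout` is a hypothetical helper, and naming tuple entries by their plain decimal index is an assumption.

```python
from typing import Any, Dict


def hdf5_layout(data_key: str, data_out: Any) -> Dict[str, Any]:
    """Map output data to HDF5-style dataset paths, following the rules above."""
    if isinstance(data_out, dict):
        # Group named by the data key, one dataset per dict key.
        return {f"{data_key}/{name}": value for name, value in data_out.items()}
    if isinstance(data_out, tuple):
        # Group named by the data key, one dataset per tuple index.
        return {f"{data_key}/{index}": value for index, value in enumerate(data_out)}
    # Anything else becomes a single dataset named by the data key.
    return {data_key: data_out}


dict_layout = hdf5_layout("result", {"labels": [0, 1], "centers": [0.5]})
tuple_layout = hdf5_layout("result", ([0, 1], [0.5]))
scalar_layout = hdf5_layout("result", 3.14)
```

Each key in the returned mapping corresponds to the path of one dataset inside the file, which makes it easy to see that dict and tuple outputs produce a group while any other value produces exactly one dataset.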
Modules
A module that contains non-cryptographic hashing functionality for Python objects.
A module that contains classes to read and write intermediate results of operations performed by instances of Processor.