corelay.io.hashing
A module that contains non-cryptographic hashing functionality for Python objects. The hashes are computed over the inputs of operations
performed by instances of Processor, so that those inputs can be identified in a way that is independent of their memory address
and recognized again between subsequent runs of the same Pipeline.
Note
Please refer to the Funcache Project to see the original implementation of this module.
Module Attributes
Tensor – Either the PyTorch Tensor class or the placeholder TensorPlaceholder class if PyTorch is not installed.
Functions
Hashes the specified data.
Classes
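HashPickler – A pickler for computing hashes.
Hasher – A hasher object with a write function for file-like updates.
SupportsConversionToNumPyArray – A protocol that defines an interface for objects that can be converted to an ndarray.
Tensor – Either the PyTorch Tensor class or the placeholder TensorPlaceholder class if PyTorch is not installed.
TensorPlaceholder – A placeholder class to stand in for PyTorch’s Tensor class in case PyTorch is not installed.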
- class corelay.io.hashing.SupportsConversionToNumPyArray
Bases: Protocol
A protocol that defines an interface for objects that can be converted to an ndarray.
- class corelay.io.hashing.TensorPlaceholder
Bases: object
A placeholder class to stand in for PyTorch’s Tensor class in case PyTorch is not installed.
- corelay.io.hashing.Tensor
Either the PyTorch Tensor class or the placeholder TensorPlaceholder class if PyTorch is not installed.
Note
This is used to check whether an object that is to be pickled is a Tensor, because PyTorch Tensor objects are converted to ndarray before pickling.
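The optional-dependency pattern described for Tensor can be sketched as follows. This is a minimal illustration and not the module's actual source: it assumes the fallback is chosen with a try/except around the import, and the helper is_tensor is a hypothetical name introduced only for this example.

```python
from typing import Any


class TensorPlaceholder:
    """Stands in for torch.Tensor when PyTorch is not installed."""


try:
    # Use the real class when PyTorch is available.
    from torch import Tensor
except ImportError:
    # Assumption: fall back to the placeholder so isinstance checks still work.
    Tensor = TensorPlaceholder  # type: ignore[misc, assignment]


def is_tensor(obj: Any) -> bool:
    """Hypothetical helper: True if obj is an instance of whichever class Tensor resolved to."""
    return isinstance(obj, Tensor)
```

With this arrangement, code that checks isinstance(obj, Tensor) never has to test whether PyTorch is importable; without PyTorch the check simply fails for every ordinary object.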
- class corelay.io.hashing.Hasher
Bases: MetroHash128
A hasher object with a write function for file-like updates.
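A write method is what lets a pickler treat the hasher as an output file and stream bytes straight into it. The sketch below shows that idea under stated assumptions: it substitutes hashlib.blake2b from the standard library for MetroHash128 (which is a third-party dependency), and the names SketchHasher and sketch_hash are hypothetical.

```python
import hashlib
import pickle
from typing import Any


class SketchHasher:
    """Hypothetical stand-in for Hasher, built on hashlib.blake2b instead of MetroHash128."""

    def __init__(self) -> None:
        self._state = hashlib.blake2b(digest_size=16)  # 128-bit digest

    def write(self, data: bytes) -> int:
        # File-like interface: each chunk written by the pickler updates the hash.
        self._state.update(data)
        return len(data)

    def hexdigest(self) -> str:
        return self._state.hexdigest()


def sketch_hash(obj: Any) -> str:
    """Hash an object by pickling it directly into the hasher, without an intermediate buffer."""
    hasher = SketchHasher()
    pickle.Pickler(hasher, protocol=4).dump(obj)
    return hasher.hexdigest()
```

Because the pickle stream is never materialized as a whole, this approach keeps memory use low even for large inputs.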
- class corelay.io.hashing.HashPickler
Bases: Pickler
A pickler for computing hashes.
- static numpy_id(array: ndarray[Any, Any]) → tuple[str, tuple[int, ...], bytes, bytes]
Computes a unique ID for an ndarray, which consists of the data type name, the array’s shape, and the values of the array decomposed into their respective mantissas and exponents, each as a bytes sequence.
- Parameters:
array (numpy.ndarray[Any, Any]) – The ndarray to compute the ID for.
- Returns:
A tuple containing the data type name, the array’s shape, and the values of the array decomposed into their respective mantissas and exponents, each as a bytes sequence.
- Return type:
tuple[str, tuple[int, ...], bytes, bytes]
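The mantissa and exponent decomposition can be illustrated without NumPy. The sketch below applies math.frexp to plain Python floats and is only an analogy for what numpy_id is described as doing; the function name float_sequence_id, the fixed "float64" label, and the one-dimensional shape are assumptions made for this example.

```python
import math
from array import array
from typing import Sequence


def float_sequence_id(values: Sequence[float]) -> tuple[str, tuple[int, ...], bytes, bytes]:
    """Hypothetical helper: build an ID for a flat sequence of floats."""
    mantissas = array("d")  # C doubles
    exponents = array("i")  # C ints
    for value in values:
        # frexp splits value into (m, e) such that value == m * 2**e.
        mantissa, exponent = math.frexp(value)
        mantissas.append(mantissa)
        exponents.append(exponent)
    # Assumption: the dtype label and 1-D shape are fixed for this sketch.
    return ("float64", (len(values),), mantissas.tobytes(), exponents.tobytes())
```

Two sequences with equal values produce equal IDs regardless of where they live in memory, which is the property the hash relies on.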
- persistent_id(obj: Any) → tuple[str, tuple[int, ...], bytes, bytes] | None
Computes a persistent ID for an object that is to be pickled, which can be used by the pickle module to identify two objects as “the same” during the un-pickling process. The persistent ID identifies the object in a way that is independent of its memory address. This is useful for caching and serialization purposes.
- Parameters:
obj (Any) – The object to compute the persistent ID for.
- Returns:
A persistent ID for the object. If the object is an ndarray, the ID is a tuple containing the data type name, the array’s shape, and the values of the array decomposed into their respective mantissas and exponents, each as a bytes sequence. If the object is a Tensor, the tensor is converted to an ndarray and the ID is computed for that array. If the object is neither, None is returned.
- Return type:
tuple[str, tuple[int, ...], bytes, bytes] | None
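How a persistent_id override places stable IDs into the pickle stream can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the HashPickler implementation: it uses array.array as a stand-in for ndarray, and SketchPickler and stable_dumps are hypothetical names.

```python
import io
import pickle
from array import array
from typing import Any


class SketchPickler(pickle.Pickler):
    """Hypothetical pickler that replaces array.array objects with a stable ID."""

    def persistent_id(self, obj: Any) -> Any:
        if isinstance(obj, array):
            # Stand-in for an ndarray ID: type code, shape, and raw bytes.
            return (obj.typecode, (len(obj),), obj.tobytes())
        # None tells pickle to serialize the object in the usual way.
        return None


def stable_dumps(obj: Any) -> bytes:
    """Pickle obj with SketchPickler and return the resulting byte stream."""
    buffer = io.BytesIO()
    SketchPickler(buffer, protocol=4).dump(obj)
    return buffer.getvalue()
```

Since the stream is only meant to be hashed, no matching persistent_load is defined here; loading such a stream back would require one.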