corelay.processor.distance
A module that contains processors for pair-wise distance metrics.
Classes
Distance | The abstract base class for distance processors.
SciPyPDist | A distance metric that computes the pair-wise distance between observations in n-dimensional space using scipy.spatial.distance.pdist().
- class corelay.processor.distance.Distance
Bases: Processor
The abstract base class for distance processors.
- Parameters:
is_output (bool) – A value indicating whether this Distance processor is the output of a Pipeline. Defaults to False.
is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.
io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.
- __tracked__: collections.OrderedDict[str, Any]
A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed in double underscores.
- class corelay.processor.distance.SciPyPDist
Bases: Distance
A distance metric that computes the pair-wise distance between observations in n-dimensional space using scipy.spatial.distance.pdist().
- Parameters:
is_output (bool) – A value indicating whether this SciPyPDist distance processor is the output of a Pipeline. Defaults to False.
is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.
io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.
metric (str) – The distance metric to use. Defaults to “euclidean”.
m_kwargs (dict) – Additional keyword arguments to pass to the distance function.
- metric: Annotated[str, Param]
The distance metric to use. Can be one of
“braycurtis”
“canberra”
“chebychev”, “chebyshev”, “cheby”, “cheb”, “ch”
“cityblock”, “cblock”, “cb”, “c”
“correlation”, “co”
“cosine”, “cos”
“dice”
“euclidean”, “euclid”, “eu”, “e”
“hamming”, “hamm”, “ha”, “h”
“minkowski”, “mi”, “m”
“pnorm”
“jaccard”, “jacc”, “ja”, “j”
“jensenshannon”, “js”
“mahalanobis”, “mahal”, “mah”
“rogerstanimoto”
“russellrao”
“seuclidean”, “se”, “s”
“sokalsneath”
“sqeuclidean”, “sqe”, “sqeuclid”
“yule”
Defaults to “euclidean”.
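
To make the effect of the metric choice concrete, the following sketch calls scipy.spatial.distance.pdist() directly with three of the canonical metric names listed above. It does not use corelay at all; the sample points and variable names are made up for illustration, and the use of squareform() to expand SciPy's condensed result into a square matrix is an assumption about how a square output could be obtained, not a statement about the processor's internals.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Three illustrative 2-D observations (hypothetical data).
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# pdist returns a condensed vector of the n*(n-1)/2 pairwise distances;
# squareform expands it into a symmetric (n, n) matrix with a zero diagonal.
euclidean = squareform(pdist(points, metric="euclidean"))
cityblock = squareform(pdist(points, metric="cityblock"))
chebyshev = squareform(pdist(points, metric="chebyshev"))

print(euclidean)
print(cityblock)
print(chebyshev)
```

The same observations yield different distance matrices depending on the metric, so the value of metric should match the geometry of the data being analysed.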
- __tracked__: collections.OrderedDict[str, Any]
A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed in double underscores.
- m_kwargs: Annotated[dict[str, Any], Param]
Additional keyword arguments to pass to the distance function.
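
As an illustration of what such additional keyword arguments can look like, the sketch below passes the order p of the Minkowski norm to scipy.spatial.distance.pdist(). It assumes that the entries of m_kwargs are forwarded to the distance function by keyword unpacking; that forwarding is written out by hand here with plain SciPy, and the variable names only mirror the parameter names for readability.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Hypothetical observations of shape (number_of_samples, number_of_features).
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Stand-ins for the processor parameters of the same names.
metric = "minkowski"
m_kwargs = {"p": 1}  # order of the Minkowski norm; p=1 is the city-block distance

# Assumed forwarding: the extra keyword arguments are unpacked into pdist.
condensed = pdist(points, metric=metric, **m_kwargs)
print(condensed)
```

With p set to 1 the Minkowski distance coincides with the “cityblock” metric, which is an easy way to check that the keyword argument took effect.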
- function(data: Any) → Any
Applies the pairwise distance function to the input data.
- Parameters:
data (Any) – The input data that is to be processed. The input data should be a NumPy array of shape (number_of_samples, number_of_features).
- Raises:
ValueError – The distance metric is not valid.
- Returns:
Returns the pairwise distance matrix of shape (number_of_samples, number_of_samples).
- Return type:
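
The contract described for function() (samples in, square distance matrix out, ValueError for an unknown metric) can be sketched in pure Python. This is a stand-in for illustration only and not corelay's implementation: the helper pairwise_distances and the METRICS table are hypothetical names, only two metrics are covered, and nested lists are used instead of NumPy arrays.

```python
import math

# Hypothetical metric table for illustration; the documented processor relies on SciPy.
METRICS = {
    "euclidean": math.dist,
    "cityblock": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
}


def pairwise_distances(samples, metric="euclidean"):
    """Return the (number_of_samples, number_of_samples) distance matrix as nested lists."""
    if metric not in METRICS:
        raise ValueError(f"Unknown distance metric: {metric!r}")
    distance = METRICS[metric]
    # Entry [i][j] is the distance between sample i and sample j.
    return [[distance(a, b) for b in samples] for a in samples]


samples = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
matrix = pairwise_distances(samples)
for row in matrix:
    print(row)
```

The result is symmetric with zeros on the diagonal, since every observation has distance zero to itself and the metrics used here do not depend on argument order.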