corelay.processor.preprocessing

A module that contains processors for pre-processing data and images.

Classes

  • Histogram – A Processor that computes channel-wise histograms of the input data.

  • ImagePreProcessor – The abstract base class for all processors that perform pre-processing on images.

  • Pooling – A Processor that performs image pooling on the input data.

  • PreProcessor – The abstract base class for pre-processing processors.

  • Rescale – A Processor that rescales images by a specified scale factor.

  • Resize – A Processor that resizes images to a specified width and height.

class corelay.processor.preprocessing.PreProcessor

Bases: Processor

The abstract base class for pre-processing processors.

Parameters:
  • is_output (bool) – A value indicating whether this PreProcessor is the output of a Pipeline. Defaults to False.

  • is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.

  • io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.

  • kwargs (dict) – Additional keyword arguments to pass to the pre-processing function. Defaults to an empty dict.

kwargs: Annotated[dict[str, Any], Param]

Additional keyword arguments to pass to the pre-processing function.

Return type:

Plug

__tracked__: collections.OrderedDict[str, Any]

A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed with double underscores.

class corelay.processor.preprocessing.Histogram

Bases: PreProcessor

A Processor that computes channel-wise histograms of the input data.

Parameters:
  • is_output (bool) – A value indicating whether this Histogram pre-processor is the output of a Pipeline. Defaults to False.

  • is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.

  • io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.

  • kwargs (dict) – Additional keyword arguments to pass to the pre-processing function. Defaults to an empty dict.

  • bins (int) – The number of bins for the histogram. Defaults to 32.

channels_first: Annotated[bool, Param]

A value indicating whether the input data is in channels-first format or not. Defaults to True.

Return type:

Plug

bins: Annotated[int, Param]

Number of bins for the histogram. Defaults to 32.

Return type:

Plug

function(data: Any) → Any

Computes channel-wise histograms from the input data.

Parameters:

data (Any) –

The input data to compute histograms for, which should be a NumPy array of images. The images can be in one of the following formats:

  1. (number_of_samples, number_of_channels, height, width), if channels_first is set to True.

  2. (number_of_samples, height, width, number_of_channels), if channels_first is set to False.

  3. (number_of_samples, height, width).

Returns:

Returns a NumPy array of shape (number_of_samples, number_of_channels, bins) that contains the channel-wise histograms of the input data. If the input data is grayscale, the number of channels is 1.

Return type:

Any
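The input formats and the output shape described above can be illustrated with a small NumPy sketch. This is not corelay's implementation: the helper name channelwise_histograms is hypothetical, and the binning details (raw counts, per-channel value range) are assumptions made only to show how the three input layouts map to an output of shape (number_of_samples, number_of_channels, bins).

```python
import numpy as np


def channelwise_histograms(data, bins=32, channels_first=True):
    """Hypothetical helper: one histogram per sample and channel."""
    data = np.asarray(data)
    if data.ndim == 3:
        # Grayscale input (N, H, W): add a single channel axis.
        data = data[:, np.newaxis, :, :]
    elif not channels_first:
        # Channels-last input (N, H, W, C): move channels to axis 1.
        data = np.moveaxis(data, -1, 1)

    n_samples, n_channels = data.shape[:2]
    # Flatten the spatial axes so each channel becomes a 1-D vector.
    flat = data.reshape(n_samples, n_channels, -1)

    histograms = np.empty((n_samples, n_channels, bins), dtype=np.int64)
    for i in range(n_samples):
        for j in range(n_channels):
            # Assumption: raw counts over the value range of each channel.
            histograms[i, j], _ = np.histogram(flat[i, j], bins=bins)
    return histograms


images = np.random.default_rng(0).random((4, 3, 8, 8))
print(channelwise_histograms(images, bins=16).shape)
```

Each histogram in this sketch sums to height × width, because every pixel of a channel falls into exactly one bin.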

__tracked__: collections.OrderedDict[str, Any]

A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed with double underscores.

class corelay.processor.preprocessing.ImagePreProcessor

Bases: PreProcessor

The abstract base class for all processors that perform pre-processing on images.

Parameters:
  • is_output (bool) – A value indicating whether this ImagePreProcessor pre-processor is the output of a Pipeline. Defaults to False.

  • is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.

  • io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.

  • kwargs (dict) – Additional keyword arguments to pass to the image pre-processing function. Defaults to an empty dict.

  • filter (int) – The order of interpolation. The order has to be in the range 0-5. Defaults to 1 (bi-linear).

  • channels_first (bool) – A value indicating whether the input data is in channels-first format or not. Defaults to True.

filter: Annotated[int, Param]

The order of interpolation. The order has to be in the range 0-5:

  • 0: Nearest-neighbor

  • 1: Bi-linear (default)

  • 2: Bi-quadratic

  • 3: Bi-cubic

  • 4: Bi-quartic

  • 5: Bi-quintic

Defaults to 1 (bi-linear).

Return type:

Plug
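As a quick reference, the interpolation orders listed for filter can be written as a lookup table. The names INTERPOLATION_ORDERS and describe_filter are hypothetical and not part of corelay; the sketch only mirrors the documented range 0-5 and the default of 1.

```python
# Interpolation orders as documented for the filter parameter.
INTERPOLATION_ORDERS = {
    0: "nearest-neighbor",
    1: "bi-linear",
    2: "bi-quadratic",
    3: "bi-cubic",
    4: "bi-quartic",
    5: "bi-quintic",
}


def describe_filter(order=1):
    """Hypothetical helper: name of an interpolation order in the range 0-5."""
    if order not in INTERPOLATION_ORDERS:
        raise ValueError(f"filter must be in the range 0-5, got {order!r}")
    return INTERPOLATION_ORDERS[order]


print(describe_filter())   # uses the documented default order 1
print(describe_filter(3))
```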

channels_first: Annotated[bool, Param]

A value indicating whether the input data is in channels-first format or not. Defaults to True.

Return type:

Plug
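The difference between the two layouts that channels_first selects can be shown with NumPy alone. The helper names below are hypothetical; the sketch only demonstrates how a channels-last batch of shape (number_of_samples, height, width, number_of_channels) relates to its channels-first counterpart.

```python
import numpy as np


def to_channels_first(data):
    """(N, H, W, C) -> (N, C, H, W); returns a view, no data is copied."""
    return np.moveaxis(data, -1, 1)


def to_channels_last(data):
    """(N, C, H, W) -> (N, H, W, C); returns a view, no data is copied."""
    return np.moveaxis(data, 1, -1)


batch = np.zeros((8, 32, 24, 3))       # channels-last batch
print(to_channels_first(batch).shape)  # channels-first layout
```

If the data is already stored channels-last, setting channels_first to False avoids the need for such a conversion before the processor is applied.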

__tracked__: collections.OrderedDict[str, Any]

A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed with double underscores.

class corelay.processor.preprocessing.Resize

Bases: ImagePreProcessor

A Processor that resizes images to a specified width and height.

Parameters:
  • is_output (bool) – A value indicating whether this Resize image pre-processor is the output of a Pipeline. Defaults to False.

  • is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.

  • io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.

  • kwargs (dict) – Additional keyword arguments to pass to the image pre-processing function. Defaults to an empty dict.

  • filter (int) – The order of interpolation. The order has to be in the range 0-5. Defaults to 1 (bi-linear).

  • channels_first (bool) – A value indicating whether the input data is in channels-first format or not. Defaults to True.

  • width (int) – The width to which the images are resized. Defaults to 100.

  • height (int) – The height to which the images are resized. Defaults to 100.

width: Annotated[int, Param]

The width to which the images are resized. Defaults to 100.

Return type:

Plug

height: Annotated[int, Param]

The height to which the images are resized. Defaults to 100.

Return type:

Plug

function(data: Any) → Any

Resizes the input images to the specified width and height.

Parameters:

data (Any) –

The input data, which contains the images that are to be resized. The input data should be a NumPy array in one of the following formats:

  1. (number_of_samples, number_of_channels, height, width), if channels_first is set to True.

  2. (number_of_samples, height, width, number_of_channels), if channels_first is set to False.

  3. (number_of_samples, height, width).

Returns:

Returns a NumPy array containing the resized images, with a shape that matches the input data format.

Return type:

Any
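To make the shape handling concrete, the following NumPy sketch resizes a batch with nearest-neighbor sampling, which corresponds to filter order 0. It is an illustration under stated assumptions, not corelay's implementation: the helper resize_nearest is hypothetical, and higher interpolation orders are not covered.

```python
import numpy as np


def resize_nearest(data, height=100, width=100, channels_first=True):
    """Hypothetical helper: nearest-neighbor resize of a batch of images."""
    data = np.asarray(data)
    if data.ndim == 4 and not channels_first:
        in_h, in_w = data.shape[1:3]   # (N, H, W, C)
    else:
        in_h, in_w = data.shape[-2:]   # (N, C, H, W) or (N, H, W)

    # For every output row/column, pick the source row/column it maps to.
    rows = np.arange(height) * in_h // height
    cols = np.arange(width) * in_w // width

    if data.ndim == 4 and not channels_first:
        return data[:, rows[:, np.newaxis], cols, :]
    return data[..., rows[:, np.newaxis], cols]


batch = np.random.default_rng(0).random((2, 3, 50, 40))
print(resize_nearest(batch, height=100, width=100).shape)
```

The output keeps the layout of the input: only the height and width axes change, while the sample and channel axes are left untouched.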

__tracked__: collections.OrderedDict[str, Any]

A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed with double underscores.

class corelay.processor.preprocessing.Rescale

Bases: ImagePreProcessor

A Processor that rescales images by a specified scale factor.

Parameters:
  • is_output (bool) – A value indicating whether this Rescale image pre-processor is the output of a Pipeline. Defaults to False.

  • is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.

  • io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.

  • kwargs (dict) – Additional keyword arguments to pass to the image pre-processing function. Defaults to an empty dict.

  • filter (int) – The order of interpolation. The order has to be in the range 0-5. Defaults to 1 (bi-linear).

  • channels_first (bool) – A value indicating whether the input data is in channels-first format or not. Defaults to True.

  • scale (float) – The scale factor by which the images are rescaled. Defaults to 0.5.

__tracked__: collections.OrderedDict[str, Any]

A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed with double underscores.

scale: Annotated[float, Param]

The scale factor by which the images are rescaled. Defaults to 0.5.

Return type:

Plug

function(data: Any) → Any

Rescales the input images by the specified scale factor.

Parameters:

data (Any) –

The input data, which contains the images that are to be rescaled. The input data should be a NumPy array in one of the following formats:

  1. (number_of_samples, number_of_channels, height, width), if channels_first is set to True.

  2. (number_of_samples, height, width, number_of_channels), if channels_first is set to False.

  3. (number_of_samples, height, width).

Returns:

Returns a NumPy array containing the rescaled images, with a shape that matches the input data format.

Return type:

Any
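The effect of scale on the output shape can be worked out without touching any pixel data. The helper rescaled_shape below is hypothetical, and rounding each scaled dimension to the nearest integer is an assumption; the sketch only shows which axes are affected in each of the three input formats.

```python
def rescaled_shape(shape, scale=0.5, channels_first=True):
    """Hypothetical helper: expected output shape after rescaling."""
    shape = tuple(shape)
    if len(shape) == 3:
        spatial_axes = (1, 2)   # (N, H, W)
    elif len(shape) == 4:
        # (N, C, H, W) if channels_first, else (N, H, W, C)
        spatial_axes = (2, 3) if channels_first else (1, 2)
    else:
        raise ValueError("expected a 3-D or 4-D batch of images")

    new_shape = list(shape)
    for axis in spatial_axes:
        # Assumption: dimensions are rounded and never drop below 1.
        new_shape[axis] = max(1, round(shape[axis] * scale))
    return tuple(new_shape)


print(rescaled_shape((10, 3, 64, 48), scale=0.5))
```

Only height and width are scaled; the number of samples and the number of channels stay as they are.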

class corelay.processor.preprocessing.Pooling

Bases: PreProcessor

A Processor that performs image pooling on the input data.

Parameters:
  • is_output (bool) – A value indicating whether this Pooling image pre-processor is the output of a Pipeline. Defaults to False.

  • is_checkpoint (bool | None) – A value indicating whether check-pointed pipeline computations should start at this point, if there exists a previously computed checkpoint value. Defaults to False.

  • io (Storable | None) – An IO object that is used to cache intermediate results of the Pipeline, which can then be re-used in this run or in subsequent runs of the Pipeline. Defaults to an instance of NoStorage.

  • kwargs (dict) – Additional keyword arguments to pass to the image pre-processing function. Defaults to an empty dict.

  • filter (int) – The order of interpolation. The order has to be in the range 0-5. Defaults to 1 (bi-linear).

  • channels_first (bool) – A value indicating whether the input data is in channels-first format or not. Defaults to True.

  • stride (tuple[int]) – The pooling stride, given as one entry per axis of the input in the order (number_of_samples, number_of_channels, height, width). Defaults to (1, 1, 2, 2).

  • pooling_function (FunctionType) – The pooling function to use to reduce the selected blocks. Defaults to numpy.sum().

__tracked__: collections.OrderedDict[str, Any]

A collections.OrderedDict with all public class attributes, i.e., all class attributes not enclosed with double underscores.

stride: Annotated[tuple[int], Param]

The pooling stride, given as one entry per axis of the input in the order (number_of_samples, number_of_channels, height, width). Defaults to (1, 1, 2, 2).

Return type:

Plug

pooling_function: Annotated[LambdaType, Param]

The pooling function to use to reduce the selected blocks. Defaults to numpy.sum().

Return type:

Plug

function(data: Any) → Any

Performs pooling on the input data.

Parameters:

data (Any) – The input data, which should be a NumPy array of shape (number_of_samples, number_of_channels, height, width).

Returns:

Returns a NumPy array containing the pooled data.

Return type:

Any
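How stride and pooling_function interact can be sketched in plain NumPy. The helper block_pool is hypothetical and is not corelay's implementation; it assumes that every axis length is divisible by the corresponding stride entry and that the pooling function accepts an axis argument, as numpy.sum does.

```python
import numpy as np


def block_pool(data, stride=(1, 1, 2, 2), pooling_function=np.sum):
    """Hypothetical helper: reduce non-overlapping blocks of size stride."""
    data = np.asarray(data)
    if data.ndim != len(stride):
        raise ValueError("stride needs one entry per axis of the input")
    if any(size % step for size, step in zip(data.shape, stride)):
        # Assumption of this sketch: no padding or cropping is performed.
        raise ValueError("each axis length must be divisible by its stride")

    # Split every axis into (number_of_blocks, block_size).
    split_shape = []
    for size, step in zip(data.shape, stride):
        split_shape.extend((size // step, step))
    blocks = data.reshape(split_shape)

    # Reduce over the block_size axes, which sit at the odd positions.
    block_axes = tuple(range(1, 2 * data.ndim, 2))
    return pooling_function(blocks, axis=block_axes)


image = np.arange(16).reshape(1, 1, 4, 4)
print(block_pool(image))                           # sum over each 2x2 block
print(block_pool(image, pooling_function=np.max))  # max over each 2x2 block
```

With the default stride of (1, 1, 2, 2), samples and channels are left as they are, while height and width are each halved.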