Features

The AutoCarver.features module defines the feature types used throughout AutoCarver. It provides classes and functions for handling qualitative features (categorical and ordinal) and quantitative features, along with a Features container that groups them.

Features

class AutoCarver.features.Features(categoricals: list[CategoricalFeature | str] | None = None, quantitatives: list[QuantitativeFeature | str] | None = None, ordinals: list[OrdinalFeature] | dict[str, list[str]] | None = None, **kwargs)

A set of typed features

Parameters:
  • categoricals (list[CategoricalFeature | str], optional) – List of categorical features or column names, by default None

  • quantitatives (list[QuantitativeFeature | str], optional) – List of quantitative features or column names, by default None

  • ordinals (list[OrdinalFeature] | dict[str, list[str]], optional) – List of ordinal features, or dict mapping column names to their ordered list of values, by default None

Warning

At least one of categoricals, ordinals or quantitatives should be provided.

Keyword Arguments:
  • ordinal_encoding (bool, optional) – Whether or not to ordinal encode labels, by default False

  • nan (str, optional) – Label for missing values, by default "__NAN__"

  • default (str, optional) – Label for default values, by default "__OTHER__"
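The argument shapes described above can be sketched as follows. The column names and the has_any_feature helper are hypothetical and only illustrate the documented warning; they are not part of AutoCarver. The import is guarded so the snippet also runs where AutoCarver is not installed.

```python
# Hypothetical column names, one group per feature type.
categoricals = ["city", "job"]
quantitatives = ["age", "income"]
# Ordinal features given as a dict: column name -> ordered list of values.
ordinals = {"education": ["primary", "secondary", "tertiary"]}


def has_any_feature(categoricals=None, quantitatives=None, ordinals=None) -> bool:
    """Hypothetical check mirroring the warning: at least one group is non-empty."""
    return any(bool(group) for group in (categoricals, quantitatives, ordinals))


assert has_any_feature(categoricals, quantitatives, ordinals)

try:
    from AutoCarver.features import Features  # requires AutoCarver to be installed
except ImportError:
    Features = None

if Features is not None:
    # Call follows the constructor signature documented above.
    features = Features(
        categoricals=categoricals,
        quantitatives=quantitatives,
        ordinals=ordinals,
    )
    print(features.names)
```

Passing plain column names keeps the call short; per the signature, already-built CategoricalFeature, QuantitativeFeature or OrdinalFeature objects are accepted as well.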

property categoricals: list[CategoricalFeature]

Returns all categorical features

classmethod load(features_json: dict) → Features

Loads a set of Features from its serialized dictionary

Parameters:

features_json (dict) – Dictionary of serialized Features

Returns:

Loaded Features.

Return type:

Features

property names: list[str]

Returns names of all features

property ordinals: list[OrdinalFeature]

Returns all ordinal features

property qualitatives: list[QualitativeFeature]

Returns all qualitative features

property quantitatives: list[QuantitativeFeature]

Returns all quantitative features

property summary: DataFrame

Summary of discretization process for all features

to_json(light_mode: bool = False) → dict

Serializes Features for JSON saving

Parameters:

light_mode (bool, optional) – Whether or not to serialize in light mode (without statistics and history), by default False
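Since to_json returns a dict meant for JSON saving and load takes such a dict back, a save and reload cycle could look like the sketch below. The features_json dictionary here is a placeholder with invented content; the real structure produced by to_json is not described in this section.

```python
import json
import tempfile
from pathlib import Path

# Placeholder standing in for the dictionary returned by `features.to_json()`.
# Its keys and values are invented for illustration only.
features_json = {"age": {"placeholder": True}, "city": {"placeholder": True}}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "features.json"

    # Save the serialized dictionary to disk.
    with path.open("w", encoding="utf-8") as file:
        json.dump(features_json, file)

    # Read it back; the result is what would be passed to `Features.load(...)`.
    with path.open("r", encoding="utf-8") as file:
        loaded_json = json.load(file)

# The dictionary survives the round trip unchanged.
round_trip_ok = loaded_json == features_json
```

With AutoCarver installed, the placeholder would be replaced by features.to_json(), and loaded_json would be passed to Features.load.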

property versions: list[str]

Returns versions of all features

Qualitative features

class AutoCarver.features.CategoricalFeature(name: str, **kwargs)

Defines a categorical feature

Parameters:

name (str) – Name of the feature

Keyword Arguments:
  • ordinal_encoding (bool, optional) – Whether or not to ordinal encode labels, by default False

  • nan (str, optional) – Label for missing values, by default "__NAN__"

  • default (str, optional) – Label for default values, by default "__OTHER__"
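To illustrate the role of the nan and default labels, here is a minimal sketch in plain Python. It assumes that missing values receive the nan label and that values outside a known set receive the default label; the label_value helper and the example data are hypothetical and do not reproduce AutoCarver's internal logic.

```python
import math

NAN_LABEL = "__NAN__"        # documented default for `nan`
DEFAULT_LABEL = "__OTHER__"  # documented default for `default`


def label_value(value, known_values):
    """Hypothetical helper: map a raw value to the label it would carry."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return NAN_LABEL  # missing value
    if value not in known_values:
        return DEFAULT_LABEL  # assumed: value outside the known set
    return value


known = {"Paris", "Lyon"}
raw = ["Paris", None, "Nice", float("nan"), "Lyon"]
labels = [label_value(value, known) for value in raw]
print(labels)  # ['Paris', '__NAN__', '__OTHER__', '__NAN__', 'Lyon']
```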

property has_default: bool

Whether or not the feature has default values

property has_nan: bool

Whether or not the feature has NaNs

property history: DataFrame

Feature’s combination history

is_categorical = True

Whether or not feature is categorical

is_ordinal = False

Whether or not feature is ordinal

is_qualitative = True

Whether or not feature is qualitative

property summary: dict

Summary of feature’s discretization process

class AutoCarver.features.OrdinalFeature(name: str, values: list[str], **kwargs)

Defines an ordinal feature

Parameters:
  • name (str) – Name of the feature

  • values (list[str]) – Ordered list of all unique values for the feature
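The order of values is what distinguishes an ordinal feature from a categorical one. The sketch below shows, in plain Python, how an ordered list induces a rank for each value; the example values are hypothetical and the code illustrates the idea only, not how AutoCarver stores the ordering.

```python
# Hypothetical values, listed in their intended order.
values = ["low", "medium", "high"]

# Position of each value in the ordering.
rank = {value: position for position, value in enumerate(values)}

# Sorting observations by rank follows the declared order, not alphabetical order.
observed = ["high", "low", "medium", "low"]
sorted_observed = sorted(observed, key=rank.__getitem__)
print(sorted_observed)  # ['low', 'low', 'medium', 'high']
```

The same list can be supplied through the ordinals dict of Features, keyed by the column name, as described in the Features parameters.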

property has_default: bool

Whether or not the feature has default values

property has_nan: bool

Whether or not the feature has NaNs

property history: DataFrame

Feature’s combination history

is_categorical = False

Whether or not feature is categorical

is_ordinal = True

Whether or not feature is ordinal

is_qualitative = True

Whether or not feature is qualitative

property summary: dict

Summary of feature’s discretization process

Quantitative features

class AutoCarver.features.QuantitativeFeature(name: str, **kwargs)

Defines a quantitative feature

Parameters:

name (str) – Name of the feature

Keyword Arguments:
  • ordinal_encoding (bool, optional) – Whether or not to ordinal encode labels, by default False

  • nan (str, optional) – Label for missing values, by default "__NAN__"

  • default (str, optional) – Label for default values, by default "__OTHER__"

property has_default: bool

Whether or not the feature has default values

property has_nan: bool

Whether or not the feature has NaNs

property history: DataFrame

Feature’s combination history

is_quantitative = True

Whether or not feature is quantitative

property summary: dict

Summary of feature’s discretization process