{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Setting things up\n",
    "\n",
    "## About this notebook\n",
    "\n",
    "In this notebook, we embark on a journey to enhance the predictive power of the Titanic Dataset through sophisticated preprocessing using the ``BinaryCarver`` pipeline. Designed to maximize associations in the data, ``BinaryCarver`` is a robust Python tool capable of discretizing any type of data—whether it be quantitative or qualitative. Our specific focus is on preparing the dataset for binary classification tasks, such as predicting survival outcomes.\n",
    "\n",
    "The Titanic Dataset, derived from the iconic 1912 Titanic passenger information, provides a diverse set of features ranging from socio-economic status and age to cabin location. Leveraging ``BinaryCarver``, we aim to perform association-maximizing discretization, refining both quantitative and qualitative features to create a finely tuned dataset for our binary classification endeavors.\n",
    "\n",
    "Throughout this notebook, we'll delve into the intricacies of ``BinaryCarver``'s discretization pipeline, exploring its capabilities in handling a variety of data types. Whether it's transforming passenger ages or classifying fares, ``BinaryCarver``'s adaptability ensures that every feature is optimally represented for our classification tasks.\n",
    "\n",
    "Join us in this exploration as we harness the power of ``BinaryCarver`` to preprocess the Titanic Dataset. Through effective feature engineering and discretization, we strive to create a dataset that not only captures the nuances of the Titanic passenger profiles but also sets the stage for the development of accurate and impactful binary classification models.\n",
    "\n",
    "Let's dive in and uncover the potential of ``BinaryCarver`` in transforming the Titanic Dataset for optimal predictive modeling.\n",
    "\n",
    "\n",
    "## Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# %pip install AutoCarver[jupyter]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Titanic Data\n",
    "\n",
    "In this example notebook, we will use the Titanic dataset.\n",
    "\n",
    "The Titanic dataset is a well-known and frequently used dataset in the field of machine learning and data science. It provides information about the passengers on board the Titanic, the famous ship that sank on its maiden voyage in 1912. The dataset is often used for predictive modeling, classification, and regression tasks.\n",
    "\n",
    "The dataset includes various features such as passengers' names, ages, genders, ticket classes, cabin information, and whether they survived or not. The primary goal when working with the Titanic dataset is often to build predictive models that can infer whether a passenger survived or perished based on their individual characteristics (binary classification)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Siblings/Spouses Aboard</th>\n",
       "      <th>Parents/Children Aboard</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Mr. Owen Harris Braund</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7.2500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Mrs. John Bradley (Florence Briggs Thayer) Cum...</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>71.2833</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Miss. Laina Heikkinen</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>7.9250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Mrs. Jacques Heath (Lily May Peel) Futrelle</td>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>53.1000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Mr. William Henry Allen</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>8.0500</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Survived  Pclass                                               Name  \\\n",
       "0         0       3                             Mr. Owen Harris Braund   \n",
       "1         1       1  Mrs. John Bradley (Florence Briggs Thayer) Cum...   \n",
       "2         1       3                              Miss. Laina Heikkinen   \n",
       "3         1       1        Mrs. Jacques Heath (Lily May Peel) Futrelle   \n",
       "4         0       3                            Mr. William Henry Allen   \n",
       "\n",
       "      Sex   Age  Siblings/Spouses Aboard  Parents/Children Aboard     Fare  \n",
       "0    male  22.0                        1                        0   7.2500  \n",
       "1  female  38.0                        1                        0  71.2833  \n",
       "2  female  26.0                        0                        0   7.9250  \n",
       "3  female  35.0                        1                        0  53.1000  \n",
       "4    male  35.0                        0                        0   8.0500  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# URL to the Titanic dataset on Kaggle\n",
    "titanic_url = \"https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv\"\n",
    "\n",
    "# Use pandas to read the CSV file directly from the URL\n",
    "titanic_data = pd.read_csv(titanic_url)\n",
    "\n",
    "# Display the first few rows of the dataset\n",
    "titanic_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Target type and Carver selection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Survived\n",
       "0    545\n",
       "1    342\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "target = \"Survived\"\n",
    "\n",
    "titanic_data[target].value_counts(dropna=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The target ``\"Survived\"`` is a binary target of type ``int64`` used in a classification task. Hence we will use ``AutoCarver.BinaryCarver`` and ``AutoCarver.selectors.ClassificationSelector`` in following code blocks."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Sampling"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(np.float64(0.38552188552188554), np.float64(0.3856655290102389))"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "# stratified sampling by target\n",
    "train_set, dev_set = train_test_split(titanic_data, test_size=0.33, random_state=42, stratify=titanic_data[target])\n",
    "\n",
    "# checking target rate per dataset\n",
    "train_set[target].mean(), dev_set[target].mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up Features to Carver"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Siblings/Spouses Aboard</th>\n",
       "      <th>Parents/Children Aboard</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>617</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Mr. Antoni Yasbeck</td>\n",
       "      <td>male</td>\n",
       "      <td>27.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>14.4542</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>489</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>Mr. Harry Markland Molson</td>\n",
       "      <td>male</td>\n",
       "      <td>55.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>30.5000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>871</th>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Miss. Adele Kiamie Najib</td>\n",
       "      <td>female</td>\n",
       "      <td>15.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>7.2250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>654</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Mrs. John (Catherine) Bourke</td>\n",
       "      <td>female</td>\n",
       "      <td>32.0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>15.5000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>653</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Mr. Alexander Radeff</td>\n",
       "      <td>male</td>\n",
       "      <td>27.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>7.8958</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Survived  Pclass                          Name     Sex   Age  \\\n",
       "617         0       3            Mr. Antoni Yasbeck    male  27.0   \n",
       "489         0       1     Mr. Harry Markland Molson    male  55.0   \n",
       "871         1       3      Miss. Adele Kiamie Najib  female  15.0   \n",
       "654         0       3  Mrs. John (Catherine) Bourke  female  32.0   \n",
       "653         0       3          Mr. Alexander Radeff    male  27.0   \n",
       "\n",
       "     Siblings/Spouses Aboard  Parents/Children Aboard     Fare  \n",
       "617                        1                        0  14.4542  \n",
       "489                        0                        0  30.5000  \n",
       "871                        0                        0   7.2250  \n",
       "654                        1                        1  15.5000  \n",
       "653                        0                        0   7.8958  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_set.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Survived                     int64\n",
       "Pclass                       int64\n",
       "Name                        object\n",
       "Sex                         object\n",
       "Age                        float64\n",
       "Siblings/Spouses Aboard      int64\n",
       "Parents/Children Aboard      int64\n",
       "Fare                       float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# column data types\n",
    "train_set.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Parents/Children Aboard\n",
       "0    438\n",
       "1     87\n",
       "2     60\n",
       "3      3\n",
       "5      3\n",
       "4      2\n",
       "6      1\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# values taken by Parents/Children Aboard\n",
    "train_set[\"Parents/Children Aboard\"].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Pclass\n",
       "3    326\n",
       "1    142\n",
       "2    126\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# values taken by Pclass\n",
    "train_set[\"Pclass\"].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The feature ``\"Pclass\"`` is of type ``\"int64\"``, but it can be considered a qualitative ordinal feature rather than a quantitative discrete feature (socio-economic status). Thus we will add it to the list of ``ordinal_features`` and set the ordering of its values in ``values_orders`` (string values). \n",
    "\n",
    "``\"Sex\"`` is the only quantitative categorical feature, it's added to the list of ``qualitative_features``.\n",
    "\n",
    "``\"Fare\"`` is the only quantitative continuous features, whilst ``\"Age\"``, ``\"Siblings/Spouses Aboard\"`` and ``\"Parents/Children Aboard\"`` can be considered as quantitative discrete features. Those four features will be added to the list of ``quantitative_features``."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Ordinal('Pclass'), Categorical('Sex'), Quantitative('Age'))"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from AutoCarver import Features\n",
    "\n",
    "# initiating Features to carve\n",
    "features = Features(\n",
    "    categoricals=[\"Sex\"],\n",
    "    quantitatives=[\"Age\", \"Fare\", \"Siblings/Spouses Aboard\", \"Parents/Children Aboard\"],\n",
    "    ordinals={\"Pclass\": [\"1\", \"2\", \"3\"]},  # user-specified ordering for ordinal features\n",
    ")\n",
    "features[\"Pclass\"], features[\"Sex\"], features[\"Age\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using AutoCarver"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## AutoCarver settings\n",
    "\n",
    "### Representativness of modalities\n",
    "\n",
    "The attribute ``min_freq`` allows one to choose the minimum frequency per basic modalities. It is used:\n",
    "\n",
    "- For quantitative features, to define the number of quantiles to initialy discretize the features with.\n",
    "\n",
    "- For qualitative features, to define the threshold under which a modality is grouped to either a default value or its closest modality."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "min_freq = 0.05"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Tip:** should be set between ``0.01`` (slower, preciser, less robust) and ``0.2`` (faster, more robust)"
   ]
  },
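  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the two roles of ``min_freq``, here is a minimal standard-library sketch. It is not AutoCarver's implementation: the helper ``group_rare_modalities`` and the toy values are hypothetical, and the idea that the initial number of quantiles is about ``1 / min_freq`` is an assumption made for illustration.\n",
    "\n",
    "```python\n",
    "from collections import Counter\n",
    "\n",
    "\n",
    "def group_rare_modalities(values, min_freq, default='__OTHER__'):\n",
    "    # hypothetical helper: replace modalities rarer than min_freq by a default value\n",
    "    n = len(values)\n",
    "    counts = Counter(values)\n",
    "    rare = {modality for modality, count in counts.items() if count / n < min_freq}\n",
    "    return [default if value in rare else value for value in values]\n",
    "\n",
    "\n",
    "# toy categorical feature: 'c' covers only 3% of the rows\n",
    "values = ['a'] * 60 + ['b'] * 37 + ['c'] * 3\n",
    "grouped = group_rare_modalities(values, min_freq=0.05)\n",
    "print(Counter(grouped))\n",
    "\n",
    "# assumption: quantitative features start from roughly 1 / min_freq quantiles\n",
    "n_quantiles = round(1 / 0.05)\n",
    "print(n_quantiles)\n",
    "```\n",
    "\n",
    "In this toy example ``c`` falls below the 5% threshold and is replaced by the default value, while ``a`` and ``b`` are kept as they are."
   ]
  },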
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: Desired number of modalities\n",
    "\n",
    "The attribute ``max_n_mod`` allows one to choose the maximum number of modalities per carved feature. It is used by **Carvers** has the upper limit of number of modalities per consecutive combination of modalities."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "max_n_mod = 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Tip:** should be set between ``3`` (faster, more robust) and ``7`` (slower, preciser, less robust)"
   ]
  },
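  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make \"consecutive combinations of modalities\" concrete, the sketch below enumerates every way of cutting an ordered list of modalities into at most ``max_n_mod`` consecutive groups. The function ``consecutive_groupings`` is a hypothetical helper written for this illustration; it is not part of AutoCarver and may differ from how the library enumerates candidates.\n",
    "\n",
    "```python\n",
    "from itertools import combinations\n",
    "\n",
    "\n",
    "def consecutive_groupings(modalities, max_n_mod):\n",
    "    # yield every split of an ordered list into 2..max_n_mod consecutive groups\n",
    "    n = len(modalities)\n",
    "    for n_groups in range(2, min(max_n_mod, n) + 1):\n",
    "        # choose the positions of the cuts between neighbouring modalities\n",
    "        for cuts in combinations(range(1, n), n_groups - 1):\n",
    "            bounds = (0, *cuts, n)\n",
    "            yield [modalities[start:end] for start, end in zip(bounds, bounds[1:])]\n",
    "\n",
    "\n",
    "groupings = list(consecutive_groupings(['1', '2', '3'], max_n_mod=5))\n",
    "for grouping in groupings:\n",
    "    print(grouping)\n",
    "```\n",
    "\n",
    "For three ordered values such as those of ``\"Pclass\"``, this yields 3 candidate groupings, which appears consistent with the 3 combinations shown by the progress bars when fitting. Lowering ``max_n_mod`` shrinks this search space, which is why smaller values are faster."
   ]
  },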
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: Grouping NaNs\n",
    "\n",
    "The attribute ``dropna`` allows one to choose whether or not ``nan`` should be grouped with another modality. If set to ``True``, **Carvers** will first find the most suitable combination of non-``nan`` values, and then test out all possible combinations with ``nan``."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "dropna = False  # anyway, there are no nan in this dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "## Fitting AutoCarver"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* First, all qualitative features are discretized:\n",
    "    1. Using ``StringDiscretizer`` to convert them to ``str`` if not already the case\n",
    "    2. For qualitative ordinal features: using ``OrdinalDiscretizer`` for under-represented values (less frequent than ``min_freq``) to be grouped with its closest modality\n",
    "    3. For qualitative categorical features: using ``CategoricalDiscretizer`` for under-represented values (less frequent than ``min_freq``) to be grouped with a default value (``features.default=\"__OTHER__\"``)\n",
    "\n",
    "* Second, all quantitative features are discretized:\n",
    "    1. Using ``ContinuousDiscretizer`` for quantile discretization that keeps track of over-represented values (more frequent than ``min_freq``)\n",
    "    2. Using ``OrdinalDiscretizer`` for any remaining under-represented values (less frequent than ``min_freq/2``) to be grouped with its closest modality\n",
    "\n",
    "* Third, all features are carved following this recipe, for all classes of ``train_set[target]`` (except one):\n",
    "    1. The raw distribution is printed out on provided ``train_set`` and ``dev_set``. It's the output of the discretization step\n",
    "    2. Grouping modalities: all consecutive combinations of modalities are applied to ``train_set``\n",
    "    3. Computing associations: the association metric (Tschruprow's T, by default) is computed with the provided ``train_set[target]``\n",
    "    4. Combinations are sorted in descending order by association value\n",
    "    5. Testing robustness: finds the first combination that checks the following:\n",
    "        - Representativness of modalities on ``train_set`` and ``dev_set`` (all should be more frequent than ``min_freq/2``)\n",
    "        - Distinct target rates per consecutive modalities on ``train_set`` and ``dev_set`` \n",
    "        - No inversion of target rates between ``train_set`` and ``dev_set`` (same ordering of modalities by target rate)\n",
    "    6. (Optional) If requested via ``dropna=True``, and if any, all combinations of modalities with ``nan`` are applied to ``train_set`` and steps 3. and 4. are run\n",
    "    7. The carved distribution is printed out on provided ``train_set`` and ``dev_set``. It's the output of the carving step"
   ]
  },
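  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The association and sorting steps of the recipe above can be sketched with the standard library alone. This is an illustration under stated assumptions, not AutoCarver's code: ``tschuprow_t`` is a hypothetical helper, the contingency tables are toy counts, and the formula used is the textbook one, $T = \\sqrt{\\chi^2 / (n \\sqrt{(r-1)(c-1)})}$, for a table with $r$ rows, $c$ columns and $n$ observations.\n",
    "\n",
    "```python\n",
    "from math import sqrt\n",
    "\n",
    "\n",
    "def tschuprow_t(table):\n",
    "    # table: contingency table as a list of rows of counts\n",
    "    # (rows = groups of modalities, columns = target classes)\n",
    "    n = sum(sum(row) for row in table)\n",
    "    row_totals = [sum(row) for row in table]\n",
    "    col_totals = [sum(col) for col in zip(*table)]\n",
    "    # Pearson chi-square statistic (assumes no empty row or column)\n",
    "    chi2 = 0.0\n",
    "    for i, row in enumerate(table):\n",
    "        for j, observed in enumerate(row):\n",
    "            expected = row_totals[i] * col_totals[j] / n\n",
    "            chi2 += (observed - expected) ** 2 / expected\n",
    "    n_rows, n_cols = len(table), len(table[0])\n",
    "    return sqrt(chi2 / (n * sqrt((n_rows - 1) * (n_cols - 1))))\n",
    "\n",
    "\n",
    "# toy counts of (target = 0, target = 1) per group, for two candidate groupings\n",
    "candidates = {\n",
    "    'two groups': [[30, 10], [10, 30]],\n",
    "    'three groups': [[30, 10], [20, 20], [10, 30]],\n",
    "}\n",
    "\n",
    "# sort candidate groupings by descending association with the target\n",
    "ranked = sorted(candidates, key=lambda name: tschuprow_t(candidates[name]), reverse=True)\n",
    "for name in ranked:\n",
    "    print(name, round(tschuprow_t(candidates[name]), 4))\n",
    "```\n",
    "\n",
    "In this toy example the two-group candidate ranks first: the extra middle group carries no additional signal, and the normalization by the table's dimensions penalizes it. The robustness checks would then be applied to the candidates in this order."
   ]
  },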
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "------\n",
      "--- [QuantitativeDiscretizer] Fit Features(['Age', 'Fare', 'Siblings/Spouses Aboard', 'Parents/Children Aboard'])\n",
      " - [ContinuousDiscretizer] Fit Features(['Age', 'Fare', 'Siblings/Spouses Aboard', 'Parents/Children Aboard'])\n",
      " - [OrdinalDiscretizer] Fit Features(['Age', 'Fare', 'Parents/Children Aboard'])\n",
      "------\n",
      "\n",
      "------\n",
      "--- [QualitativeDiscretizer] Fit Features(['Sex', 'Pclass'])\n",
      " - [StringDiscretizer] Fit Features(['Pclass'])\n",
      " - [OrdinalDiscretizer] Fit Features(['Pclass'])\n",
      " - [CategoricalDiscretizer] Fit Features(['Sex'])\n",
      "------\n",
      "\n",
      "---------\n",
      "------ [BinaryCarver] Fit Features(['Sex', 'Pclass', 'Age', 'Fare', 'Siblings/Spouses Aboard', 'Parents/Children Aboard'])\n",
      "--- [BinaryCarver] Fit Categorical('Sex') (1/6)\n",
      " [BinaryCarver] Raw distribution\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_0af27_row0_col0, #T_0af27_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_0af27_row0_col1, #T_0af27_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_0af27\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_0af27_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_0af27_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_0af27_level0_row0\" class=\"row_heading level0 row0\" >male</th>\n",
       "      <td id=\"T_0af27_row0_col0\" class=\"data row0 col0\" >0.1878</td>\n",
       "      <td id=\"T_0af27_row0_col1\" class=\"data row0 col1\" >0.6364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0af27_level0_row1\" class=\"row_heading level0 row1\" >female</th>\n",
       "      <td id=\"T_0af27_row1_col0\" class=\"data row1 col0\" >0.7315</td>\n",
       "      <td id=\"T_0af27_row1_col1\" class=\"data row1 col1\" >0.3636</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_adc40_row0_col0, #T_adc40_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_adc40_row0_col1, #T_adc40_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_adc40\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_adc40_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_adc40_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_adc40_row0_col0\" class=\"data row0 col0\" >0.1949</td>\n",
       "      <td id=\"T_adc40_row0_col1\" class=\"data row0 col1\" >0.6655</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_adc40_row1_col0\" class=\"data row1 col0\" >0.7653</td>\n",
       "      <td id=\"T_adc40_row1_col1\" class=\"data row1 col1\" >0.3345</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Grouping modalities   :   0%|          | 0/1 [00:00<?, ?it/s]\n",
      "Computing associations: 100%|██████████| 1/1 [00:00<?, ?it/s]\n",
      "Testing robustness    :   0%|          | 0/1 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      " [BinaryCarver] Carved distribution\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_9cdd8_row0_col0, #T_9cdd8_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_9cdd8_row0_col1, #T_9cdd8_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_9cdd8\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_9cdd8_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_9cdd8_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_9cdd8_level0_row0\" class=\"row_heading level0 row0\" >male</th>\n",
       "      <td id=\"T_9cdd8_row0_col0\" class=\"data row0 col0\" >0.1878</td>\n",
       "      <td id=\"T_9cdd8_row0_col1\" class=\"data row0 col1\" >0.6364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_9cdd8_level0_row1\" class=\"row_heading level0 row1\" >female</th>\n",
       "      <td id=\"T_9cdd8_row1_col0\" class=\"data row1 col0\" >0.7315</td>\n",
       "      <td id=\"T_9cdd8_row1_col1\" class=\"data row1 col1\" >0.3636</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_80338_row0_col0, #T_80338_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_80338_row0_col1, #T_80338_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_80338\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_80338_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_80338_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_80338_row0_col0\" class=\"data row0 col0\" >0.1949</td>\n",
       "      <td id=\"T_80338_row0_col1\" class=\"data row0 col1\" >0.6655</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_80338_row1_col0\" class=\"data row1 col0\" >0.7653</td>\n",
       "      <td id=\"T_80338_row1_col1\" class=\"data row1 col1\" >0.3345</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- [BinaryCarver] Fit Ordinal('Pclass') (2/6)\n",
      " [BinaryCarver] Raw distribution\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_4738a_row0_col0, #T_4738a_row2_col1 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_4738a_row0_col1 {\n",
       "  background-color: #536edd;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_4738a_row1_col0 {\n",
       "  background-color: #f0cdbb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_4738a_row1_col1, #T_4738a_row2_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_4738a\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_4738a_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_4738a_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_4738a_level0_row0\" class=\"row_heading level0 row0\" >1</th>\n",
       "      <td id=\"T_4738a_row0_col0\" class=\"data row0 col0\" >0.6197</td>\n",
       "      <td id=\"T_4738a_row0_col1\" class=\"data row0 col1\" >0.2391</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_4738a_level0_row1\" class=\"row_heading level0 row1\" >2</th>\n",
       "      <td id=\"T_4738a_row1_col0\" class=\"data row1 col0\" >0.4683</td>\n",
       "      <td id=\"T_4738a_row1_col1\" class=\"data row1 col1\" >0.2121</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_4738a_level0_row2\" class=\"row_heading level0 row2\" >3</th>\n",
       "      <td id=\"T_4738a_row2_col0\" class=\"data row2 col0\" >0.2515</td>\n",
       "      <td id=\"T_4738a_row2_col1\" class=\"data row2 col1\" >0.5488</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_94ff3_row0_col0, #T_94ff3_row2_col1 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_94ff3_row0_col1 {\n",
       "  background-color: #6b8df0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_94ff3_row1_col0 {\n",
       "  background-color: #f2cab5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_94ff3_row1_col1, #T_94ff3_row2_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_94ff3\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_94ff3_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_94ff3_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_94ff3_row0_col0\" class=\"data row0 col0\" >0.6486</td>\n",
       "      <td id=\"T_94ff3_row0_col1\" class=\"data row0 col1\" >0.2526</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_94ff3_row1_col0\" class=\"data row1 col0\" >0.4828</td>\n",
       "      <td id=\"T_94ff3_row1_col1\" class=\"data row1 col1\" >0.1980</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_94ff3_row2_col0\" class=\"data row2 col0\" >0.2298</td>\n",
       "      <td id=\"T_94ff3_row2_col1\" class=\"data row2 col1\" >0.5495</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Grouping modalities   :  67%|██████▋   | 2/3 [00:00<00:00, 1988.76it/s]\n",
      "Computing associations: 100%|██████████| 3/3 [00:00<00:00, 2981.03it/s]\n",
      "Testing robustness    :   0%|          | 0/3 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      " [BinaryCarver] Carved distribution\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_a09fa_row0_col0, #T_a09fa_row1_col1 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a09fa_row0_col1, #T_a09fa_row1_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_a09fa\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_a09fa_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_a09fa_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_a09fa_level0_row0\" class=\"row_heading level0 row0\" >1 to 2</th>\n",
       "      <td id=\"T_a09fa_row0_col0\" class=\"data row0 col0\" >0.5485</td>\n",
       "      <td id=\"T_a09fa_row0_col1\" class=\"data row0 col1\" >0.4512</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a09fa_level0_row1\" class=\"row_heading level0 row1\" >3</th>\n",
       "      <td id=\"T_a09fa_row1_col0\" class=\"data row1 col0\" >0.2515</td>\n",
       "      <td id=\"T_a09fa_row1_col1\" class=\"data row1 col1\" >0.5488</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_fd76b_row0_col0, #T_fd76b_row1_col1 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_fd76b_row0_col1, #T_fd76b_row1_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_fd76b\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_fd76b_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_fd76b_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_fd76b_row0_col0\" class=\"data row0 col0\" >0.5758</td>\n",
       "      <td id=\"T_fd76b_row0_col1\" class=\"data row0 col1\" >0.4505</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_fd76b_row1_col0\" class=\"data row1 col0\" >0.2298</td>\n",
       "      <td id=\"T_fd76b_row1_col1\" class=\"data row1 col1\" >0.5495</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- [BinaryCarver] Fit Quantitative('Age') (3/6)\n",
      " [BinaryCarver] Raw distribution\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_a4588_row0_col0, #T_a4588_row9_col1 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row0_col1, #T_a4588_row18_col1 {\n",
       "  background-color: #90b2fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row1_col0, #T_a4588_row27_col0 {\n",
       "  background-color: #c83836;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row1_col1, #T_a4588_row2_col1, #T_a4588_row27_col1 {\n",
       "  background-color: #7295f4;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row2_col0 {\n",
       "  background-color: #dcdddd;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row3_col0, #T_a4588_row31_col0 {\n",
       "  background-color: #6687ed;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row3_col1, #T_a4588_row12_col1, #T_a4588_row14_col0, #T_a4588_row26_col0, #T_a4588_row31_col1 {\n",
       "  background-color: #81a4fb;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row4_col0, #T_a4588_row11_col1, #T_a4588_row16_col1, #T_a4588_row18_col0 {\n",
       "  background-color: #f2c9b4;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row4_col1, #T_a4588_row7_col1, #T_a4588_row15_col1, #T_a4588_row29_col1 {\n",
       "  background-color: #afcafc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row5_col0 {\n",
       "  background-color: #a9c6fd;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row5_col1, #T_a4588_row23_col1 {\n",
       "  background-color: #d24b40;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row6_col0, #T_a4588_row25_col0 {\n",
       "  background-color: #ccd9ed;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row6_col1, #T_a4588_row25_col1 {\n",
       "  background-color: #ecd3c5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row7_col0, #T_a4588_row10_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row8_col0 {\n",
       "  background-color: #6180e9;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row8_col1 {\n",
       "  background-color: #d7dce3;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row9_col0 {\n",
       "  background-color: #dadce0;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row10_col0 {\n",
       "  background-color: #cfdaea;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row11_col0 {\n",
       "  background-color: #f7b599;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row12_col0 {\n",
       "  background-color: #445acc;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row13_col0 {\n",
       "  background-color: #ead4c8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row13_col1 {\n",
       "  background-color: #df634e;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row14_col1, #T_a4588_row29_col0 {\n",
       "  background-color: #cbd8ee;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row15_col0 {\n",
       "  background-color: #e2dad5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row16_col0 {\n",
       "  background-color: #98b9ff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row17_col0, #T_a4588_row19_col0 {\n",
       "  background-color: #c9d7f0;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row17_col1, #T_a4588_row19_col1, #T_a4588_row20_col1, #T_a4588_row24_col1, #T_a4588_row30_col1 {\n",
       "  background-color: #6384eb;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row20_col0 {\n",
       "  background-color: #a1c0ff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row21_col0 {\n",
       "  background-color: #ead5c9;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row21_col1 {\n",
       "  background-color: #f08b6e;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row22_col0 {\n",
       "  background-color: #d5dbe5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row22_col1 {\n",
       "  background-color: #9fbfff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row23_col0 {\n",
       "  background-color: #cad8ef;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row24_col0 {\n",
       "  background-color: #e9d5cb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row26_col1 {\n",
       "  background-color: #5572df;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row28_col0 {\n",
       "  background-color: #8db0fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_a4588_row28_col1 {\n",
       "  background-color: #485fd1;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_a4588_row30_col0 {\n",
       "  background-color: #4f69d9;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_a4588\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_a4588_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_a4588_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row0\" class=\"row_heading level0 row0\" >x <= 2.00e+00</th>\n",
       "      <td id=\"T_a4588_row0_col0\" class=\"data row0 col0\" >0.7500</td>\n",
       "      <td id=\"T_a4588_row0_col1\" class=\"data row0 col1\" >0.0269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row1\" class=\"row_heading level0 row1\" >2.00e+00 < x <= 4.00e+00</th>\n",
       "      <td id=\"T_a4588_row1_col0\" class=\"data row1 col0\" >0.7143</td>\n",
       "      <td id=\"T_a4588_row1_col1\" class=\"data row1 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row2\" class=\"row_heading level0 row2\" >4.00e+00 < x <= 8.00e+00</th>\n",
       "      <td id=\"T_a4588_row2_col0\" class=\"data row2 col0\" >0.4286</td>\n",
       "      <td id=\"T_a4588_row2_col1\" class=\"data row2 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row3\" class=\"row_heading level0 row3\" >8.00e+00 < x <= 1.40e+01</th>\n",
       "      <td id=\"T_a4588_row3_col0\" class=\"data row3 col0\" >0.2000</td>\n",
       "      <td id=\"T_a4588_row3_col1\" class=\"data row3 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row4\" class=\"row_heading level0 row4\" >1.40e+01 < x <= 1.60e+01</th>\n",
       "      <td id=\"T_a4588_row4_col0\" class=\"data row4 col0\" >0.5000</td>\n",
       "      <td id=\"T_a4588_row4_col1\" class=\"data row4 col1\" >0.0303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row5\" class=\"row_heading level0 row5\" >1.60e+01 < x <= 1.80e+01</th>\n",
       "      <td id=\"T_a4588_row5_col0\" class=\"data row5 col0\" >0.3226</td>\n",
       "      <td id=\"T_a4588_row5_col1\" class=\"data row5 col1\" >0.0522</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row6\" class=\"row_heading level0 row6\" >1.80e+01 < x <= 1.90e+01</th>\n",
       "      <td id=\"T_a4588_row6_col0\" class=\"data row6 col0\" >0.3913</td>\n",
       "      <td id=\"T_a4588_row6_col1\" class=\"data row6 col1\" >0.0387</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row7\" class=\"row_heading level0 row7\" >1.90e+01 < x <= 2.05e+01</th>\n",
       "      <td id=\"T_a4588_row7_col0\" class=\"data row7 col0\" >0.1111</td>\n",
       "      <td id=\"T_a4588_row7_col1\" class=\"data row7 col1\" >0.0303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row8\" class=\"row_heading level0 row8\" >2.05e+01 < x <= 2.10e+01</th>\n",
       "      <td id=\"T_a4588_row8_col0\" class=\"data row8 col0\" >0.1905</td>\n",
       "      <td id=\"T_a4588_row8_col1\" class=\"data row8 col1\" >0.0354</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row9\" class=\"row_heading level0 row9\" >2.10e+01 < x <= 2.20e+01</th>\n",
       "      <td id=\"T_a4588_row9_col0\" class=\"data row9 col0\" >0.4242</td>\n",
       "      <td id=\"T_a4588_row9_col1\" class=\"data row9 col1\" >0.0556</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row10\" class=\"row_heading level0 row10\" >2.20e+01 < x <= 2.35e+01</th>\n",
       "      <td id=\"T_a4588_row10_col0\" class=\"data row10 col0\" >0.4000</td>\n",
       "      <td id=\"T_a4588_row10_col1\" class=\"data row10 col1\" >0.0168</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row11\" class=\"row_heading level0 row11\" >2.35e+01 < x <= 2.40e+01</th>\n",
       "      <td id=\"T_a4588_row11_col0\" class=\"data row11 col0\" >0.5417</td>\n",
       "      <td id=\"T_a4588_row11_col1\" class=\"data row11 col1\" >0.0404</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row12\" class=\"row_heading level0 row12\" >2.40e+01 < x <= 2.50e+01</th>\n",
       "      <td id=\"T_a4588_row12_col0\" class=\"data row12 col0\" >0.1333</td>\n",
       "      <td id=\"T_a4588_row12_col1\" class=\"data row12 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row13\" class=\"row_heading level0 row13\" >2.50e+01 < x <= 2.70e+01</th>\n",
       "      <td id=\"T_a4588_row13_col0\" class=\"data row13 col0\" >0.4667</td>\n",
       "      <td id=\"T_a4588_row13_col1\" class=\"data row13 col1\" >0.0505</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row14\" class=\"row_heading level0 row14\" >2.70e+01 < x <= 2.85e+01</th>\n",
       "      <td id=\"T_a4588_row14_col0\" class=\"data row14 col0\" >0.2500</td>\n",
       "      <td id=\"T_a4588_row14_col1\" class=\"data row14 col1\" >0.0337</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row15\" class=\"row_heading level0 row15\" >2.85e+01 < x <= 2.90e+01</th>\n",
       "      <td id=\"T_a4588_row15_col0\" class=\"data row15 col0\" >0.4444</td>\n",
       "      <td id=\"T_a4588_row15_col1\" class=\"data row15 col1\" >0.0303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row16\" class=\"row_heading level0 row16\" >2.90e+01 < x <= 3.00e+01</th>\n",
       "      <td id=\"T_a4588_row16_col0\" class=\"data row16 col0\" >0.2917</td>\n",
       "      <td id=\"T_a4588_row16_col1\" class=\"data row16 col1\" >0.0404</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row17\" class=\"row_heading level0 row17\" >3.00e+01 < x <= 3.10e+01</th>\n",
       "      <td id=\"T_a4588_row17_col0\" class=\"data row17 col0\" >0.3846</td>\n",
       "      <td id=\"T_a4588_row17_col1\" class=\"data row17 col1\" >0.0219</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row18\" class=\"row_heading level0 row18\" >3.10e+01 < x <= 3.20e+01</th>\n",
       "      <td id=\"T_a4588_row18_col0\" class=\"data row18 col0\" >0.5000</td>\n",
       "      <td id=\"T_a4588_row18_col1\" class=\"data row18 col1\" >0.0269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row19\" class=\"row_heading level0 row19\" >3.20e+01 < x <= 3.30e+01</th>\n",
       "      <td id=\"T_a4588_row19_col0\" class=\"data row19 col0\" >0.3846</td>\n",
       "      <td id=\"T_a4588_row19_col1\" class=\"data row19 col1\" >0.0219</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row20\" class=\"row_heading level0 row20\" >3.30e+01 < x <= 3.40e+01</th>\n",
       "      <td id=\"T_a4588_row20_col0\" class=\"data row20 col0\" >0.3077</td>\n",
       "      <td id=\"T_a4588_row20_col1\" class=\"data row20 col1\" >0.0219</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row21\" class=\"row_heading level0 row21\" >3.40e+01 < x <= 3.60e+01</th>\n",
       "      <td id=\"T_a4588_row21_col0\" class=\"data row21 col0\" >0.4643</td>\n",
       "      <td id=\"T_a4588_row21_col1\" class=\"data row21 col1\" >0.0471</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row22\" class=\"row_heading level0 row22\" >3.60e+01 < x <= 3.80e+01</th>\n",
       "      <td id=\"T_a4588_row22_col0\" class=\"data row22 col0\" >0.4118</td>\n",
       "      <td id=\"T_a4588_row22_col1\" class=\"data row22 col1\" >0.0286</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row23\" class=\"row_heading level0 row23\" >3.80e+01 < x <= 4.10e+01</th>\n",
       "      <td id=\"T_a4588_row23_col0\" class=\"data row23 col0\" >0.3871</td>\n",
       "      <td id=\"T_a4588_row23_col1\" class=\"data row23 col1\" >0.0522</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row24\" class=\"row_heading level0 row24\" >4.10e+01 < x <= 4.20e+01</th>\n",
       "      <td id=\"T_a4588_row24_col0\" class=\"data row24 col0\" >0.4615</td>\n",
       "      <td id=\"T_a4588_row24_col1\" class=\"data row24 col1\" >0.0219</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row25\" class=\"row_heading level0 row25\" >4.20e+01 < x <= 4.50e+01</th>\n",
       "      <td id=\"T_a4588_row25_col0\" class=\"data row25 col0\" >0.3913</td>\n",
       "      <td id=\"T_a4588_row25_col1\" class=\"data row25 col1\" >0.0387</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row26\" class=\"row_heading level0 row26\" >4.50e+01 < x <= 4.70e+01</th>\n",
       "      <td id=\"T_a4588_row26_col0\" class=\"data row26 col0\" >0.2500</td>\n",
       "      <td id=\"T_a4588_row26_col1\" class=\"data row26 col1\" >0.0202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row27\" class=\"row_heading level0 row27\" >4.70e+01 < x <= 4.90e+01</th>\n",
       "      <td id=\"T_a4588_row27_col0\" class=\"data row27 col0\" >0.7143</td>\n",
       "      <td id=\"T_a4588_row27_col1\" class=\"data row27 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row28\" class=\"row_heading level0 row28\" >4.90e+01 < x <= 5.10e+01</th>\n",
       "      <td id=\"T_a4588_row28_col0\" class=\"data row28 col0\" >0.2727</td>\n",
       "      <td id=\"T_a4588_row28_col1\" class=\"data row28 col1\" >0.0185</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row29\" class=\"row_heading level0 row29\" >5.10e+01 < x <= 5.60e+01</th>\n",
       "      <td id=\"T_a4588_row29_col0\" class=\"data row29 col0\" >0.3889</td>\n",
       "      <td id=\"T_a4588_row29_col1\" class=\"data row29 col1\" >0.0303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row30\" class=\"row_heading level0 row30\" >5.60e+01 < x <= 6.10e+01</th>\n",
       "      <td id=\"T_a4588_row30_col0\" class=\"data row30 col0\" >0.1538</td>\n",
       "      <td id=\"T_a4588_row30_col1\" class=\"data row30 col1\" >0.0219</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_a4588_level0_row31\" class=\"row_heading level0 row31\" >6.10e+01 < x</th>\n",
       "      <td id=\"T_a4588_row31_col0\" class=\"data row31 col0\" >0.2000</td>\n",
       "      <td id=\"T_a4588_row31_col1\" class=\"data row31 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_7fecf_row0_col0 {\n",
       "  background-color: #dcdddd;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row0_col1, #T_7fecf_row16_col1 {\n",
       "  background-color: #9dbdff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row1_col0, #T_7fecf_row5_col1, #T_7fecf_row20_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row1_col1, #T_7fecf_row20_col1, #T_7fecf_row24_col1, #T_7fecf_row25_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row2_col0, #T_7fecf_row19_col0, #T_7fecf_row28_col0 {\n",
       "  background-color: #e16751;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row2_col1, #T_7fecf_row7_col1, #T_7fecf_row9_col1, #T_7fecf_row19_col1, #T_7fecf_row26_col1, #T_7fecf_row27_col1, #T_7fecf_row28_col1, #T_7fecf_row29_col1 {\n",
       "  background-color: #5f7fe8;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row3_col0, #T_7fecf_row30_col0 {\n",
       "  background-color: #f4987a;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row3_col1, #T_7fecf_row6_col1, #T_7fecf_row11_col1, #T_7fecf_row12_col1 {\n",
       "  background-color: #b1cbfc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row4_col0, #T_7fecf_row15_col1, #T_7fecf_row22_col0, #T_7fecf_row25_col1, #T_7fecf_row31_col0 {\n",
       "  background-color: #7396f5;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row4_col1, #T_7fecf_row17_col1, #T_7fecf_row22_col1, #T_7fecf_row31_col1 {\n",
       "  background-color: #88abfd;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row5_col0, #T_7fecf_row15_col0 {\n",
       "  background-color: #d5dbe5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row6_col0 {\n",
       "  background-color: #5875e1;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row7_col0, #T_7fecf_row16_col0 {\n",
       "  background-color: #a3c2fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row8_col0 {\n",
       "  background-color: #3f53c6;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row8_col1 {\n",
       "  background-color: #e4d9d2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row9_col0, #T_7fecf_row26_col0 {\n",
       "  background-color: #465ecf;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row10_col0, #T_7fecf_row23_col0 {\n",
       "  background-color: #506bda;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row10_col1, #T_7fecf_row23_col1 {\n",
       "  background-color: #f7aa8c;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row11_col0, #T_7fecf_row12_col0, #T_7fecf_row24_col0, #T_7fecf_row27_col0, #T_7fecf_row29_col0 {\n",
       "  background-color: #f0cdbb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row13_col0 {\n",
       "  background-color: #aec9fc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row13_col1, #T_7fecf_row21_col1 {\n",
       "  background-color: #f39475;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row14_col0 {\n",
       "  background-color: #7a9df8;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row14_col1 {\n",
       "  background-color: #dc5d4a;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row17_col0 {\n",
       "  background-color: #ee8669;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row18_col0 {\n",
       "  background-color: #c7d7f0;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7fecf_row18_col1, #T_7fecf_row30_col1 {\n",
       "  background-color: #4c66d6;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7fecf_row21_col0 {\n",
       "  background-color: #f5a081;\n",
       "  color: #000000;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_7fecf\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_7fecf_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_7fecf_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row0_col0\" class=\"data row0 col0\" >0.4444</td>\n",
       "      <td id=\"T_7fecf_row0_col1\" class=\"data row0 col1\" >0.0307</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row1_col0\" class=\"data row1 col0\" >0.7500</td>\n",
       "      <td id=\"T_7fecf_row1_col1\" class=\"data row1 col1\" >0.0137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row2_col0\" class=\"data row2 col0\" >0.6667</td>\n",
       "      <td id=\"T_7fecf_row2_col1\" class=\"data row2 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row3_col0\" class=\"data row3 col0\" >0.6000</td>\n",
       "      <td id=\"T_7fecf_row3_col1\" class=\"data row3 col1\" >0.0341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row4_col0\" class=\"data row4 col0\" >0.2500</td>\n",
       "      <td id=\"T_7fecf_row4_col1\" class=\"data row4 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row5_col0\" class=\"data row5 col0\" >0.4286</td>\n",
       "      <td id=\"T_7fecf_row5_col1\" class=\"data row5 col1\" >0.0717</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row6_col0\" class=\"data row6 col0\" >0.2000</td>\n",
       "      <td id=\"T_7fecf_row6_col1\" class=\"data row6 col1\" >0.0341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row7_col0\" class=\"data row7 col0\" >0.3333</td>\n",
       "      <td id=\"T_7fecf_row7_col1\" class=\"data row7 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row8_col0\" class=\"data row8 col0\" >0.1538</td>\n",
       "      <td id=\"T_7fecf_row8_col1\" class=\"data row8 col1\" >0.0444</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row9_col0\" class=\"data row9 col0\" >0.1667</td>\n",
       "      <td id=\"T_7fecf_row9_col1\" class=\"data row9 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row10_col0\" class=\"data row10 col0\" >0.1875</td>\n",
       "      <td id=\"T_7fecf_row10_col1\" class=\"data row10 col1\" >0.0546</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row11_col0\" class=\"data row11 col0\" >0.5000</td>\n",
       "      <td id=\"T_7fecf_row11_col1\" class=\"data row11 col1\" >0.0341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row12_col0\" class=\"data row12 col0\" >0.5000</td>\n",
       "      <td id=\"T_7fecf_row12_col1\" class=\"data row12 col1\" >0.0341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row13_col0\" class=\"data row13 col0\" >0.3529</td>\n",
       "      <td id=\"T_7fecf_row13_col1\" class=\"data row13 col1\" >0.0580</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row14_col0\" class=\"data row14 col0\" >0.2632</td>\n",
       "      <td id=\"T_7fecf_row14_col1\" class=\"data row14 col1\" >0.0648</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row15_col0\" class=\"data row15 col0\" >0.4286</td>\n",
       "      <td id=\"T_7fecf_row15_col1\" class=\"data row15 col1\" >0.0239</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row16_col0\" class=\"data row16 col0\" >0.3333</td>\n",
       "      <td id=\"T_7fecf_row16_col1\" class=\"data row16 col1\" >0.0307</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row17_col0\" class=\"data row17 col0\" >0.6250</td>\n",
       "      <td id=\"T_7fecf_row17_col1\" class=\"data row17 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row18_col0\" class=\"data row18 col0\" >0.4000</td>\n",
       "      <td id=\"T_7fecf_row18_col1\" class=\"data row18 col1\" >0.0171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row19_col0\" class=\"data row19 col0\" >0.6667</td>\n",
       "      <td id=\"T_7fecf_row19_col1\" class=\"data row19 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row20_col0\" class=\"data row20 col0\" >0.7500</td>\n",
       "      <td id=\"T_7fecf_row20_col1\" class=\"data row20 col1\" >0.0137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row21_col0\" class=\"data row21 col0\" >0.5882</td>\n",
       "      <td id=\"T_7fecf_row21_col1\" class=\"data row21 col1\" >0.0580</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row22_col0\" class=\"data row22 col0\" >0.2500</td>\n",
       "      <td id=\"T_7fecf_row22_col1\" class=\"data row22 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row23_col0\" class=\"data row23 col0\" >0.1875</td>\n",
       "      <td id=\"T_7fecf_row23_col1\" class=\"data row23 col1\" >0.0546</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row24_col0\" class=\"data row24 col0\" >0.5000</td>\n",
       "      <td id=\"T_7fecf_row24_col1\" class=\"data row24 col1\" >0.0137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row25_col0\" class=\"data row25 col0\" >0.1429</td>\n",
       "      <td id=\"T_7fecf_row25_col1\" class=\"data row25 col1\" >0.0239</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row26_col0\" class=\"data row26 col0\" >0.1667</td>\n",
       "      <td id=\"T_7fecf_row26_col1\" class=\"data row26 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row27_col0\" class=\"data row27 col0\" >0.5000</td>\n",
       "      <td id=\"T_7fecf_row27_col1\" class=\"data row27 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row28_col0\" class=\"data row28 col0\" >0.6667</td>\n",
       "      <td id=\"T_7fecf_row28_col1\" class=\"data row28 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row29_col0\" class=\"data row29 col0\" >0.5000</td>\n",
       "      <td id=\"T_7fecf_row29_col1\" class=\"data row29 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row30_col0\" class=\"data row30 col0\" >0.6000</td>\n",
       "      <td id=\"T_7fecf_row30_col1\" class=\"data row30 col1\" >0.0171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_7fecf_row31_col0\" class=\"data row31 col0\" >0.2500</td>\n",
       "      <td id=\"T_7fecf_row31_col1\" class=\"data row31 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Grouping modalities   : 100%|█████████▉| 36455/36456 [00:04<00:00, 7444.66it/s]\n",
      "Computing associations: 100%|██████████| 36456/36456 [00:10<00:00, 3595.62it/s]\n",
      "Testing robustness    :   1%|          | 302/36456 [00:00<01:49, 328.89it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      " [BinaryCarver] Carved distribution\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_4bf75_row0_col0, #T_4bf75_row1_col1 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_4bf75_row0_col1, #T_4bf75_row1_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_4bf75\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_4bf75_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_4bf75_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_4bf75_level0_row0\" class=\"row_heading level0 row0\" >x <= 8.0e+00</th>\n",
       "      <td id=\"T_4bf75_row0_col0\" class=\"data row0 col0\" >0.6364</td>\n",
       "      <td id=\"T_4bf75_row0_col1\" class=\"data row0 col1\" >0.0741</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_4bf75_level0_row1\" class=\"row_heading level0 row1\" >8.0e+00 < x</th>\n",
       "      <td id=\"T_4bf75_row1_col0\" class=\"data row1 col0\" >0.3655</td>\n",
       "      <td id=\"T_4bf75_row1_col1\" class=\"data row1 col1\" >0.9259</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_bbf87_row0_col0, #T_bbf87_row1_col1 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_bbf87_row0_col1, #T_bbf87_row1_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_bbf87\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_bbf87_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_bbf87_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_bbf87_row0_col0\" class=\"data row0 col0\" >0.5789</td>\n",
       "      <td id=\"T_bbf87_row0_col1\" class=\"data row0 col1\" >0.0648</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_bbf87_row1_col0\" class=\"data row1 col0\" >0.3723</td>\n",
       "      <td id=\"T_bbf87_row1_col1\" class=\"data row1 col1\" >0.9352</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- [BinaryCarver] Fit Quantitative('Fare') (4/6)\n",
      " [BinaryCarver] Raw distribution\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_2ea6b_row0_col0, #T_2ea6b_row3_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row0_col1, #T_2ea6b_row7_col1, #T_2ea6b_row9_col1, #T_2ea6b_row31_col1 {\n",
       "  background-color: #7295f4;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row1_col0 {\n",
       "  background-color: #6b8df0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row1_col1, #T_2ea6b_row13_col1, #T_2ea6b_row14_col1, #T_2ea6b_row22_col1, #T_2ea6b_row24_col1, #T_2ea6b_row27_col1, #T_2ea6b_row29_col1 {\n",
       "  background-color: #6687ed;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row2_col0 {\n",
       "  background-color: #9ebeff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row2_col1, #T_2ea6b_row19_col1 {\n",
       "  background-color: #96b7ff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row3_col0, #T_2ea6b_row10_col1, #T_2ea6b_row11_col1, #T_2ea6b_row16_col1, #T_2ea6b_row17_col1, #T_2ea6b_row23_col1, #T_2ea6b_row26_col1, #T_2ea6b_row30_col1, #T_2ea6b_row32_col1 {\n",
       "  background-color: #5a78e4;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row4_col0 {\n",
       "  background-color: #c0d4f5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row4_col1, #T_2ea6b_row31_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row5_col0, #T_2ea6b_row13_col0, #T_2ea6b_row18_col0, #T_2ea6b_row22_col0 {\n",
       "  background-color: #bad0f8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row5_col1 {\n",
       "  background-color: #cedaeb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row6_col0 {\n",
       "  background-color: #6e90f2;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row6_col1, #T_2ea6b_row12_col0 {\n",
       "  background-color: #f0cdbb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row7_col0 {\n",
       "  background-color: #edd1c2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row8_col0 {\n",
       "  background-color: #5d7ce6;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row8_col1 {\n",
       "  background-color: #f7b093;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row9_col0 {\n",
       "  background-color: #6788ee;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row10_col0, #T_2ea6b_row11_col0 {\n",
       "  background-color: #c3d5f4;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row12_col1 {\n",
       "  background-color: #c0282f;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row14_col0 {\n",
       "  background-color: #85a8fc;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row15_col0 {\n",
       "  background-color: #f7b89c;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row15_col1, #T_2ea6b_row20_col1, #T_2ea6b_row28_col1 {\n",
       "  background-color: #445acc;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row16_col0 {\n",
       "  background-color: #f7bca1;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row17_col0 {\n",
       "  background-color: #dadce0;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row18_col1 {\n",
       "  background-color: #e16751;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row19_col0 {\n",
       "  background-color: #f7b99e;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row20_col0 {\n",
       "  background-color: #9abbff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row21_col0 {\n",
       "  background-color: #e4d9d2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row21_col1 {\n",
       "  background-color: #4f69d9;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row23_col0 {\n",
       "  background-color: #a7c5fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row24_col0 {\n",
       "  background-color: #e5d8d1;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row25_col0 {\n",
       "  background-color: #93b5fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row25_col1 {\n",
       "  background-color: #7da0f9;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row26_col0, #T_2ea6b_row30_col0 {\n",
       "  background-color: #d75445;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row27_col0 {\n",
       "  background-color: #f3c8b2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row28_col0 {\n",
       "  background-color: #d6dce4;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_2ea6b_row29_col0 {\n",
       "  background-color: #d1493f;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_2ea6b_row32_col0 {\n",
       "  background-color: #eb7d62;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_2ea6b\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_2ea6b_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_2ea6b_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row0\" class=\"row_heading level0 row0\" >x <= 6.858e+00</th>\n",
       "      <td id=\"T_2ea6b_row0_col0\" class=\"data row0 col0\" >0.0000</td>\n",
       "      <td id=\"T_2ea6b_row0_col1\" class=\"data row0 col1\" >0.0269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row1\" class=\"row_heading level0 row1\" >6.858e+00 < x <= 7.142e+00</th>\n",
       "      <td id=\"T_2ea6b_row1_col0\" class=\"data row1 col0\" >0.1333</td>\n",
       "      <td id=\"T_2ea6b_row1_col1\" class=\"data row1 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row2\" class=\"row_heading level0 row2\" >7.142e+00 < x <= 7.229e+00</th>\n",
       "      <td id=\"T_2ea6b_row2_col0\" class=\"data row2 col0\" >0.2632</td>\n",
       "      <td id=\"T_2ea6b_row2_col1\" class=\"data row2 col1\" >0.0320</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row3\" class=\"row_heading level0 row3\" >7.229e+00 < x <= 7.250e+00</th>\n",
       "      <td id=\"T_2ea6b_row3_col0\" class=\"data row3 col0\" >0.0909</td>\n",
       "      <td id=\"T_2ea6b_row3_col1\" class=\"data row3 col1\" >0.0185</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row4\" class=\"row_heading level0 row4\" >7.250e+00 < x <= 7.750e+00</th>\n",
       "      <td id=\"T_2ea6b_row4_col0\" class=\"data row4 col0\" >0.3500</td>\n",
       "      <td id=\"T_2ea6b_row4_col1\" class=\"data row4 col1\" >0.0673</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row5\" class=\"row_heading level0 row5\" >7.750e+00 < x <= 7.854e+00</th>\n",
       "      <td id=\"T_2ea6b_row5_col0\" class=\"data row5 col0\" >0.3333</td>\n",
       "      <td id=\"T_2ea6b_row5_col1\" class=\"data row5 col1\" >0.0404</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row6\" class=\"row_heading level0 row6\" >7.854e+00 < x <= 7.896e+00</th>\n",
       "      <td id=\"T_2ea6b_row6_col0\" class=\"data row6 col0\" >0.1429</td>\n",
       "      <td id=\"T_2ea6b_row6_col1\" class=\"data row6 col1\" >0.0471</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row7\" class=\"row_heading level0 row7\" >7.896e+00 < x <= 8.029e+00</th>\n",
       "      <td id=\"T_2ea6b_row7_col0\" class=\"data row7 col0\" >0.5000</td>\n",
       "      <td id=\"T_2ea6b_row7_col1\" class=\"data row7 col1\" >0.0269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row8\" class=\"row_heading level0 row8\" >8.029e+00 < x <= 8.050e+00</th>\n",
       "      <td id=\"T_2ea6b_row8_col0\" class=\"data row8 col0\" >0.0968</td>\n",
       "      <td id=\"T_2ea6b_row8_col1\" class=\"data row8 col1\" >0.0522</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row9\" class=\"row_heading level0 row9\" >8.050e+00 < x <= 9.000e+00</th>\n",
       "      <td id=\"T_2ea6b_row9_col0\" class=\"data row9 col0\" >0.1250</td>\n",
       "      <td id=\"T_2ea6b_row9_col1\" class=\"data row9 col1\" >0.0269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row10\" class=\"row_heading level0 row10\" >9.000e+00 < x <= 9.842e+00</th>\n",
       "      <td id=\"T_2ea6b_row10_col0\" class=\"data row10 col0\" >0.3571</td>\n",
       "      <td id=\"T_2ea6b_row10_col1\" class=\"data row10 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row11\" class=\"row_heading level0 row11\" >9.842e+00 < x <= 1.050e+01</th>\n",
       "      <td id=\"T_2ea6b_row11_col0\" class=\"data row11 col0\" >0.3571</td>\n",
       "      <td id=\"T_2ea6b_row11_col1\" class=\"data row11 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row12\" class=\"row_heading level0 row12\" >1.050e+01 < x <= 1.300e+01</th>\n",
       "      <td id=\"T_2ea6b_row12_col0\" class=\"data row12 col0\" >0.5128</td>\n",
       "      <td id=\"T_2ea6b_row12_col1\" class=\"data row12 col1\" >0.0657</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row13\" class=\"row_heading level0 row13\" >1.300e+01 < x <= 1.445e+01</th>\n",
       "      <td id=\"T_2ea6b_row13_col0\" class=\"data row13 col0\" >0.3333</td>\n",
       "      <td id=\"T_2ea6b_row13_col1\" class=\"data row13 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row14\" class=\"row_heading level0 row14\" >1.445e+01 < x <= 1.550e+01</th>\n",
       "      <td id=\"T_2ea6b_row14_col0\" class=\"data row14 col0\" >0.2000</td>\n",
       "      <td id=\"T_2ea6b_row14_col1\" class=\"data row14 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row15\" class=\"row_heading level0 row15\" >1.550e+01 < x <= 1.670e+01</th>\n",
       "      <td id=\"T_2ea6b_row15_col0\" class=\"data row15 col0\" >0.5833</td>\n",
       "      <td id=\"T_2ea6b_row15_col1\" class=\"data row15 col1\" >0.0202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row16\" class=\"row_heading level0 row16\" >1.670e+01 < x <= 2.025e+01</th>\n",
       "      <td id=\"T_2ea6b_row16_col0\" class=\"data row16 col0\" >0.5714</td>\n",
       "      <td id=\"T_2ea6b_row16_col1\" class=\"data row16 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row17\" class=\"row_heading level0 row17\" >2.025e+01 < x <= 2.300e+01</th>\n",
       "      <td id=\"T_2ea6b_row17_col0\" class=\"data row17 col0\" >0.4286</td>\n",
       "      <td id=\"T_2ea6b_row17_col1\" class=\"data row17 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row18\" class=\"row_heading level0 row18\" >2.300e+01 < x <= 2.600e+01</th>\n",
       "      <td id=\"T_2ea6b_row18_col0\" class=\"data row18 col0\" >0.3333</td>\n",
       "      <td id=\"T_2ea6b_row18_col1\" class=\"data row18 col1\" >0.0606</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row19\" class=\"row_heading level0 row19\" >2.600e+01 < x <= 2.655e+01</th>\n",
       "      <td id=\"T_2ea6b_row19_col0\" class=\"data row19 col0\" >0.5789</td>\n",
       "      <td id=\"T_2ea6b_row19_col1\" class=\"data row19 col1\" >0.0320</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row20\" class=\"row_heading level0 row20\" >2.655e+01 < x <= 2.790e+01</th>\n",
       "      <td id=\"T_2ea6b_row20_col0\" class=\"data row20 col0\" >0.2500</td>\n",
       "      <td id=\"T_2ea6b_row20_col1\" class=\"data row20 col1\" >0.0202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row21\" class=\"row_heading level0 row21\" >2.790e+01 < x <= 3.000e+01</th>\n",
       "      <td id=\"T_2ea6b_row21_col0\" class=\"data row21 col0\" >0.4615</td>\n",
       "      <td id=\"T_2ea6b_row21_col1\" class=\"data row21 col1\" >0.0219</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row22\" class=\"row_heading level0 row22\" >3.000e+01 < x <= 3.139e+01</th>\n",
       "      <td id=\"T_2ea6b_row22_col0\" class=\"data row22 col0\" >0.3333</td>\n",
       "      <td id=\"T_2ea6b_row22_col1\" class=\"data row22 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row23\" class=\"row_heading level0 row23\" >3.139e+01 < x <= 3.850e+01</th>\n",
       "      <td id=\"T_2ea6b_row23_col0\" class=\"data row23 col0\" >0.2857</td>\n",
       "      <td id=\"T_2ea6b_row23_col1\" class=\"data row23 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row24\" class=\"row_heading level0 row24\" >3.850e+01 < x <= 4.240e+01</th>\n",
       "      <td id=\"T_2ea6b_row24_col0\" class=\"data row24 col0\" >0.4667</td>\n",
       "      <td id=\"T_2ea6b_row24_col1\" class=\"data row24 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row25\" class=\"row_heading level0 row25\" >4.240e+01 < x <= 5.200e+01</th>\n",
       "      <td id=\"T_2ea6b_row25_col0\" class=\"data row25 col0\" >0.2353</td>\n",
       "      <td id=\"T_2ea6b_row25_col1\" class=\"data row25 col1\" >0.0286</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row26\" class=\"row_heading level0 row26\" >5.200e+01 < x <= 5.650e+01</th>\n",
       "      <td id=\"T_2ea6b_row26_col0\" class=\"data row26 col0\" >0.7857</td>\n",
       "      <td id=\"T_2ea6b_row26_col1\" class=\"data row26 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row27\" class=\"row_heading level0 row27\" >5.650e+01 < x <= 6.955e+01</th>\n",
       "      <td id=\"T_2ea6b_row27_col0\" class=\"data row27 col0\" >0.5333</td>\n",
       "      <td id=\"T_2ea6b_row27_col1\" class=\"data row27 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row28\" class=\"row_heading level0 row28\" >6.955e+01 < x <= 7.729e+01</th>\n",
       "      <td id=\"T_2ea6b_row28_col0\" class=\"data row28 col0\" >0.4167</td>\n",
       "      <td id=\"T_2ea6b_row28_col1\" class=\"data row28 col1\" >0.0202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row29\" class=\"row_heading level0 row29\" >7.729e+01 < x <= 8.316e+01</th>\n",
       "      <td id=\"T_2ea6b_row29_col0\" class=\"data row29 col0\" >0.8000</td>\n",
       "      <td id=\"T_2ea6b_row29_col1\" class=\"data row29 col1\" >0.0253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row30\" class=\"row_heading level0 row30\" >8.316e+01 < x <= 1.109e+02</th>\n",
       "      <td id=\"T_2ea6b_row30_col0\" class=\"data row30 col0\" >0.7857</td>\n",
       "      <td id=\"T_2ea6b_row30_col1\" class=\"data row30 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row31\" class=\"row_heading level0 row31\" >1.109e+02 < x <= 1.516e+02</th>\n",
       "      <td id=\"T_2ea6b_row31_col0\" class=\"data row31 col0\" >0.8750</td>\n",
       "      <td id=\"T_2ea6b_row31_col1\" class=\"data row31 col1\" >0.0269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_2ea6b_level0_row32\" class=\"row_heading level0 row32\" >1.516e+02 < x</th>\n",
       "      <td id=\"T_2ea6b_row32_col0\" class=\"data row32 col0\" >0.7143</td>\n",
       "      <td id=\"T_2ea6b_row32_col1\" class=\"data row32 col1\" >0.0236</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_643d5_row0_col0 {\n",
       "  background-color: #6384eb;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row0_col1, #T_643d5_row2_col0, #T_643d5_row4_col0, #T_643d5_row13_col0, #T_643d5_row16_col1, #T_643d5_row27_col1, #T_643d5_row30_col1 {\n",
       "  background-color: #9bbcff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row1_col0, #T_643d5_row3_col0, #T_643d5_row3_col1, #T_643d5_row10_col0, #T_643d5_row24_col0 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row1_col1, #T_643d5_row7_col1, #T_643d5_row24_col1, #T_643d5_row28_col1 {\n",
       "  background-color: #465ecf;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row2_col1, #T_643d5_row10_col1, #T_643d5_row22_col1, #T_643d5_row23_col1 {\n",
       "  background-color: #8db0fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row4_col1, #T_643d5_row15_col0, #T_643d5_row23_col0, #T_643d5_row31_col0 {\n",
       "  background-color: #efcebd;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row5_col0 {\n",
       "  background-color: #6b8df0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row5_col1 {\n",
       "  background-color: #e7d7ce;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row6_col0, #T_643d5_row13_col1, #T_643d5_row31_col1 {\n",
       "  background-color: #5470de;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row6_col1, #T_643d5_row11_col1 {\n",
       "  background-color: #dddcdc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row7_col0 {\n",
       "  background-color: #bcd2f7;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row8_col0, #T_643d5_row9_col0 {\n",
       "  background-color: #799cf8;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row8_col1, #T_643d5_row9_col1 {\n",
       "  background-color: #c5d6f2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row11_col0, #T_643d5_row15_col1, #T_643d5_row17_col1 {\n",
       "  background-color: #aac7fd;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row12_col0 {\n",
       "  background-color: #cedaeb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row12_col1, #T_643d5_row19_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row14_col0 {\n",
       "  background-color: #e4d9d2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row14_col1, #T_643d5_row32_col1 {\n",
       "  background-color: #b9d0f9;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row16_col0 {\n",
       "  background-color: #e1dad6;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row17_col0, #T_643d5_row25_col0 {\n",
       "  background-color: #f7ac8e;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row18_col0 {\n",
       "  background-color: #f4c6af;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row18_col1 {\n",
       "  background-color: #f5c4ac;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row19_col1, #T_643d5_row29_col1 {\n",
       "  background-color: #7ea1fa;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row20_col0 {\n",
       "  background-color: #86a9fc;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row20_col1, #T_643d5_row21_col1, #T_643d5_row25_col1 {\n",
       "  background-color: #6282ea;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row21_col0 {\n",
       "  background-color: #d4dbe6;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row22_col0 {\n",
       "  background-color: #f6a283;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row26_col0, #T_643d5_row28_col0 {\n",
       "  background-color: #f18d6f;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row26_col1 {\n",
       "  background-color: #6f92f3;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row27_col0 {\n",
       "  background-color: #f6bea4;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_643d5_row29_col0 {\n",
       "  background-color: #e7745b;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row30_col0 {\n",
       "  background-color: #d44e41;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_643d5_row32_col0 {\n",
       "  background-color: #e36c55;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_643d5\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_643d5_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_643d5_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row0_col0\" class=\"data row0 col0\" >0.1111</td>\n",
       "      <td id=\"T_643d5_row0_col1\" class=\"data row0 col1\" >0.0307</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row1_col0\" class=\"data row1 col0\" >0.0000</td>\n",
       "      <td id=\"T_643d5_row1_col1\" class=\"data row1 col1\" >0.0102</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row2_col0\" class=\"data row2 col0\" >0.2500</td>\n",
       "      <td id=\"T_643d5_row2_col1\" class=\"data row2 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row3_col0\" class=\"data row3 col0\" >0.0000</td>\n",
       "      <td id=\"T_643d5_row3_col1\" class=\"data row3 col1\" >0.0068</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row4_col0\" class=\"data row4 col0\" >0.2500</td>\n",
       "      <td id=\"T_643d5_row4_col1\" class=\"data row4 col1\" >0.0546</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row5_col0\" class=\"data row5 col0\" >0.1333</td>\n",
       "      <td id=\"T_643d5_row5_col1\" class=\"data row5 col1\" >0.0512</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row6_col0\" class=\"data row6 col0\" >0.0714</td>\n",
       "      <td id=\"T_643d5_row6_col1\" class=\"data row6 col1\" >0.0478</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row7_col0\" class=\"data row7 col0\" >0.3333</td>\n",
       "      <td id=\"T_643d5_row7_col1\" class=\"data row7 col1\" >0.0102</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row8_col0\" class=\"data row8 col0\" >0.1667</td>\n",
       "      <td id=\"T_643d5_row8_col1\" class=\"data row8 col1\" >0.0410</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row9_col0\" class=\"data row9 col0\" >0.1667</td>\n",
       "      <td id=\"T_643d5_row9_col1\" class=\"data row9 col1\" >0.0410</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row10_col0\" class=\"data row10 col0\" >0.0000</td>\n",
       "      <td id=\"T_643d5_row10_col1\" class=\"data row10 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row11_col0\" class=\"data row11 col0\" >0.2857</td>\n",
       "      <td id=\"T_643d5_row11_col1\" class=\"data row11 col1\" >0.0478</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row12_col0\" class=\"data row12 col0\" >0.3846</td>\n",
       "      <td id=\"T_643d5_row12_col1\" class=\"data row12 col1\" >0.0887</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row13_col0\" class=\"data row13 col0\" >0.2500</td>\n",
       "      <td id=\"T_643d5_row13_col1\" class=\"data row13 col1\" >0.0137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row14_col0\" class=\"data row14 col0\" >0.4545</td>\n",
       "      <td id=\"T_643d5_row14_col1\" class=\"data row14 col1\" >0.0375</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row15_col0\" class=\"data row15 col0\" >0.5000</td>\n",
       "      <td id=\"T_643d5_row15_col1\" class=\"data row15 col1\" >0.0341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row16_col0\" class=\"data row16 col0\" >0.4444</td>\n",
       "      <td id=\"T_643d5_row16_col1\" class=\"data row16 col1\" >0.0307</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row17_col0\" class=\"data row17 col0\" >0.6000</td>\n",
       "      <td id=\"T_643d5_row17_col1\" class=\"data row17 col1\" >0.0341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row18_col0\" class=\"data row18 col0\" >0.5294</td>\n",
       "      <td id=\"T_643d5_row18_col1\" class=\"data row18 col1\" >0.0580</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row19_col0\" class=\"data row19 col0\" >0.8571</td>\n",
       "      <td id=\"T_643d5_row19_col1\" class=\"data row19 col1\" >0.0239</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row20_col0\" class=\"data row20 col0\" >0.2000</td>\n",
       "      <td id=\"T_643d5_row20_col1\" class=\"data row20 col1\" >0.0171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row21_col0\" class=\"data row21 col0\" >0.4000</td>\n",
       "      <td id=\"T_643d5_row21_col1\" class=\"data row21 col1\" >0.0171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row22_col0\" class=\"data row22 col0\" >0.6250</td>\n",
       "      <td id=\"T_643d5_row22_col1\" class=\"data row22 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row23_col0\" class=\"data row23 col0\" >0.5000</td>\n",
       "      <td id=\"T_643d5_row23_col1\" class=\"data row23 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row24_col0\" class=\"data row24 col0\" >0.0000</td>\n",
       "      <td id=\"T_643d5_row24_col1\" class=\"data row24 col1\" >0.0102</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row25_col0\" class=\"data row25 col0\" >0.6000</td>\n",
       "      <td id=\"T_643d5_row25_col1\" class=\"data row25 col1\" >0.0171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row26_col0\" class=\"data row26 col0\" >0.6667</td>\n",
       "      <td id=\"T_643d5_row26_col1\" class=\"data row26 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row27_col0\" class=\"data row27 col0\" >0.5556</td>\n",
       "      <td id=\"T_643d5_row27_col1\" class=\"data row27 col1\" >0.0307</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row28_col0\" class=\"data row28 col0\" >0.6667</td>\n",
       "      <td id=\"T_643d5_row28_col1\" class=\"data row28 col1\" >0.0102</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row29_col0\" class=\"data row29 col0\" >0.7143</td>\n",
       "      <td id=\"T_643d5_row29_col1\" class=\"data row29 col1\" >0.0239</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row30_col0\" class=\"data row30 col0\" >0.7778</td>\n",
       "      <td id=\"T_643d5_row30_col1\" class=\"data row30 col1\" >0.0307</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row31_col0\" class=\"data row31 col0\" >0.5000</td>\n",
       "      <td id=\"T_643d5_row31_col1\" class=\"data row31 col1\" >0.0137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_643d5_row32_col0\" class=\"data row32 col0\" >0.7273</td>\n",
       "      <td id=\"T_643d5_row32_col1\" class=\"data row32 col1\" >0.0375</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Grouping modalities   : 100%|█████████▉| 41447/41448 [00:06<00:00, 6897.10it/s]\n",
      "Computing associations: 100%|██████████| 41448/41448 [00:09<00:00, 4172.63it/s]\n",
      "Testing robustness    :   0%|          | 0/41448 [00:00<?, ?it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      " [BinaryCarver] Carved distribution\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_6b676_row0_col0, #T_6b676_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_6b676_row0_col1, #T_6b676_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_6b676\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_6b676_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_6b676_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_6b676_level0_row0\" class=\"row_heading level0 row0\" >x <= 5.2e+01</th>\n",
       "      <td id=\"T_6b676_row0_col0\" class=\"data row0 col0\" >0.3198</td>\n",
       "      <td id=\"T_6b676_row0_col1\" class=\"data row0 col1\" >0.8316</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_6b676_level0_row1\" class=\"row_heading level0 row1\" >5.2e+01 < x</th>\n",
       "      <td id=\"T_6b676_row1_col0\" class=\"data row1 col0\" >0.7100</td>\n",
       "      <td id=\"T_6b676_row1_col1\" class=\"data row1 col1\" >0.1684</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_59762_row0_col0, #T_59762_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_59762_row0_col1, #T_59762_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_59762\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_59762_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_59762_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_59762_row0_col0\" class=\"data row0 col0\" >0.3279</td>\n",
       "      <td id=\"T_59762_row0_col1\" class=\"data row0 col1\" >0.8328</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_59762_row1_col0\" class=\"data row1 col0\" >0.6735</td>\n",
       "      <td id=\"T_59762_row1_col1\" class=\"data row1 col1\" >0.1672</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- [BinaryCarver] Fit Quantitative('Siblings/Spouses Aboard') (5/6)\n",
      " [BinaryCarver] Raw distribution\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_7a1f6_row0_col0 {\n",
       "  background-color: #f7ba9f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7a1f6_row0_col1, #T_7a1f6_row2_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7a1f6_row1_col0 {\n",
       "  background-color: #d44e41;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7a1f6_row1_col1 {\n",
       "  background-color: #a6c4fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7a1f6_row2_col1 {\n",
       "  background-color: #4055c8;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7a1f6_row3_col0 {\n",
       "  background-color: #90b2fe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_7a1f6_row3_col1 {\n",
       "  background-color: #4257c9;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_7a1f6_row4_col0, #T_7a1f6_row4_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_7a1f6\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_7a1f6_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_7a1f6_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_7a1f6_level0_row0\" class=\"row_heading level0 row0\" >x <= 0.00e+00</th>\n",
       "      <td id=\"T_7a1f6_row0_col0\" class=\"data row0 col0\" >0.3614</td>\n",
       "      <td id=\"T_7a1f6_row0_col1\" class=\"data row0 col1\" >0.6801</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_7a1f6_level0_row1\" class=\"row_heading level0 row1\" >0.00e+00 < x <= 1.00e+00</th>\n",
       "      <td id=\"T_7a1f6_row1_col0\" class=\"data row1 col0\" >0.5000</td>\n",
       "      <td id=\"T_7a1f6_row1_col1\" class=\"data row1 col1\" >0.2323</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_7a1f6_level0_row2\" class=\"row_heading level0 row2\" >1.00e+00 < x <= 2.00e+00</th>\n",
       "      <td id=\"T_7a1f6_row2_col0\" class=\"data row2 col0\" >0.5500</td>\n",
       "      <td id=\"T_7a1f6_row2_col1\" class=\"data row2 col1\" >0.0337</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_7a1f6_level0_row3\" class=\"row_heading level0 row3\" >2.00e+00 < x <= 4.00e+00</th>\n",
       "      <td id=\"T_7a1f6_row3_col0\" class=\"data row3 col0\" >0.1429</td>\n",
       "      <td id=\"T_7a1f6_row3_col1\" class=\"data row3 col1\" >0.0354</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_7a1f6_level0_row4\" class=\"row_heading level0 row4\" >4.00e+00 < x</th>\n",
       "      <td id=\"T_7a1f6_row4_col0\" class=\"data row4 col0\" >0.0000</td>\n",
       "      <td id=\"T_7a1f6_row4_col1\" class=\"data row4 col1\" >0.0185</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_f0b20_row0_col0 {\n",
       "  background-color: #e4d9d2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_f0b20_row0_col1, #T_f0b20_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_f0b20_row1_col1 {\n",
       "  background-color: #b1cbfc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_f0b20_row2_col0 {\n",
       "  background-color: #c4d5f3;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_f0b20_row2_col1 {\n",
       "  background-color: #455cce;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_f0b20_row3_col0 {\n",
       "  background-color: #dfdbd9;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_f0b20_row3_col1 {\n",
       "  background-color: #4c66d6;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_f0b20_row4_col0, #T_f0b20_row4_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_f0b20\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_f0b20_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_f0b20_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_f0b20_row0_col0\" class=\"data row0 col0\" >0.3200</td>\n",
       "      <td id=\"T_f0b20_row0_col1\" class=\"data row0 col1\" >0.6826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_f0b20_row1_col0\" class=\"data row1 col0\" >0.6056</td>\n",
       "      <td id=\"T_f0b20_row1_col1\" class=\"data row1 col1\" >0.2423</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_f0b20_row2_col0\" class=\"data row2 col0\" >0.2500</td>\n",
       "      <td id=\"T_f0b20_row2_col1\" class=\"data row2 col1\" >0.0273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_f0b20_row3_col0\" class=\"data row3 col0\" >0.3077</td>\n",
       "      <td id=\"T_f0b20_row3_col1\" class=\"data row3 col1\" >0.0444</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_f0b20_row4_col0\" class=\"data row4 col0\" >0.0000</td>\n",
       "      <td id=\"T_f0b20_row4_col1\" class=\"data row4 col1\" >0.0034</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Grouping modalities   :  93%|█████████▎| 14/15 [00:00<00:00, 4605.87it/s]\n",
      "Computing associations: 100%|██████████| 15/15 [00:00<00:00, 1820.44it/s]\n",
      "Testing robustness    :  67%|██████▋   | 10/15 [00:00<00:00, 322.91it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      " [BinaryCarver] Carved distribution\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_b1f25_row0_col0 {\n",
       "  background-color: #c0d4f5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_b1f25_row0_col1, #T_b1f25_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_b1f25_row1_col1 {\n",
       "  background-color: #8badfd;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_b1f25_row2_col0, #T_b1f25_row2_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_b1f25\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_b1f25_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_b1f25_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_b1f25_level0_row0\" class=\"row_heading level0 row0\" >x <= 0.00e+00</th>\n",
       "      <td id=\"T_b1f25_row0_col0\" class=\"data row0 col0\" >0.3614</td>\n",
       "      <td id=\"T_b1f25_row0_col1\" class=\"data row0 col1\" >0.6801</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_b1f25_level0_row1\" class=\"row_heading level0 row1\" >0.00e+00 < x <= 1.00e+00</th>\n",
       "      <td id=\"T_b1f25_row1_col0\" class=\"data row1 col0\" >0.5000</td>\n",
       "      <td id=\"T_b1f25_row1_col1\" class=\"data row1 col1\" >0.2323</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_b1f25_level0_row2\" class=\"row_heading level0 row2\" >1.00e+00 < x</th>\n",
       "      <td id=\"T_b1f25_row2_col0\" class=\"data row2 col0\" >0.2692</td>\n",
       "      <td id=\"T_b1f25_row2_col1\" class=\"data row2 col1\" >0.0875</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_f64ab_row0_col0 {\n",
       "  background-color: #6788ee;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_f64ab_row0_col1, #T_f64ab_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_f64ab_row1_col1 {\n",
       "  background-color: #96b7ff;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_f64ab_row2_col0, #T_f64ab_row2_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_f64ab\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_f64ab_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_f64ab_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_f64ab_row0_col0\" class=\"data row0 col0\" >0.3200</td>\n",
       "      <td id=\"T_f64ab_row0_col1\" class=\"data row0 col1\" >0.6826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_f64ab_row1_col0\" class=\"data row1 col0\" >0.6056</td>\n",
       "      <td id=\"T_f64ab_row1_col1\" class=\"data row1 col1\" >0.2423</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_f64ab_row2_col0\" class=\"data row2 col0\" >0.2727</td>\n",
       "      <td id=\"T_f64ab_row2_col1\" class=\"data row2 col1\" >0.0751</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- [BinaryCarver] Fit Quantitative('Parents/Children Aboard') (6/6)\n",
      " [BinaryCarver] Raw distribution\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_e8a40_row0_col0 {\n",
       "  background-color: #4c66d6;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e8a40_row0_col1, #T_e8a40_row2_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e8a40_row1_col0 {\n",
       "  background-color: #ca3b37;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e8a40_row1_col1 {\n",
       "  background-color: #7597f6;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e8a40_row2_col1 {\n",
       "  background-color: #5f7fe8;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e8a40_row3_col0, #T_e8a40_row3_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_e8a40\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_e8a40_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_e8a40_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_e8a40_level0_row0\" class=\"row_heading level0 row0\" >x <= 0.00e+00</th>\n",
       "      <td id=\"T_e8a40_row0_col0\" class=\"data row0 col0\" >0.3447</td>\n",
       "      <td id=\"T_e8a40_row0_col1\" class=\"data row0 col1\" >0.7374</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e8a40_level0_row1\" class=\"row_heading level0 row1\" >0.00e+00 < x <= 1.00e+00</th>\n",
       "      <td id=\"T_e8a40_row1_col0\" class=\"data row1 col0\" >0.5057</td>\n",
       "      <td id=\"T_e8a40_row1_col1\" class=\"data row1 col1\" >0.1465</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e8a40_level0_row2\" class=\"row_heading level0 row2\" >1.00e+00 < x <= 2.00e+00</th>\n",
       "      <td id=\"T_e8a40_row2_col0\" class=\"data row2 col0\" >0.5167</td>\n",
       "      <td id=\"T_e8a40_row2_col1\" class=\"data row2 col1\" >0.1010</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e8a40_level0_row3\" class=\"row_heading level0 row3\" >2.00e+00 < x</th>\n",
       "      <td id=\"T_e8a40_row3_col0\" class=\"data row3 col0\" >0.3333</td>\n",
       "      <td id=\"T_e8a40_row3_col1\" class=\"data row3 col1\" >0.0152</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_0fed3_row0_col0 {\n",
       "  background-color: #b1cbfc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_0fed3_row0_col1, #T_0fed3_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_0fed3_row1_col1 {\n",
       "  background-color: #5b7ae5;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_0fed3_row2_col0 {\n",
       "  background-color: #ead4c8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_0fed3_row2_col1 {\n",
       "  background-color: #4c66d6;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_0fed3_row3_col0, #T_0fed3_row3_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_0fed3\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_0fed3_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_0fed3_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_0fed3_row0_col0\" class=\"data row0 col0\" >0.3475</td>\n",
       "      <td id=\"T_0fed3_row0_col1\" class=\"data row0 col1\" >0.8055</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_0fed3_row1_col0\" class=\"data row1 col0\" >0.6774</td>\n",
       "      <td id=\"T_0fed3_row1_col1\" class=\"data row1 col1\" >0.1058</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_0fed3_row2_col0\" class=\"data row2 col0\" >0.4500</td>\n",
       "      <td id=\"T_0fed3_row2_col1\" class=\"data row2 col1\" >0.0683</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_0fed3_row3_col0\" class=\"data row3 col0\" >0.1667</td>\n",
       "      <td id=\"T_0fed3_row3_col1\" class=\"data row3 col1\" >0.0205</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Grouping modalities   :  86%|████████▌ | 6/7 [00:00<00:00, 604.67it/s]\n",
      "Computing associations: 100%|██████████| 7/7 [00:00<00:00, 932.45it/s]\n",
      "Testing robustness    :   0%|          | 0/7 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      " [BinaryCarver] Carved distribution\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_d12f1_row0_col0, #T_d12f1_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_d12f1_row0_col1, #T_d12f1_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_d12f1\" style='display:inline'>\n",
       "  <caption>X distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_d12f1_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_d12f1_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_d12f1_level0_row0\" class=\"row_heading level0 row0\" >x <= 0.0e+00</th>\n",
       "      <td id=\"T_d12f1_row0_col0\" class=\"data row0 col0\" >0.3447</td>\n",
       "      <td id=\"T_d12f1_row0_col1\" class=\"data row0 col1\" >0.7374</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_d12f1_level0_row1\" class=\"row_heading level0 row1\" >0.0e+00 < x</th>\n",
       "      <td id=\"T_d12f1_row1_col0\" class=\"data row1 col0\" >0.5000</td>\n",
       "      <td id=\"T_d12f1_row1_col1\" class=\"data row1 col1\" >0.2626</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "          <style type=\"text/css\">\n",
       "#T_8677f_row0_col0, #T_8677f_row1_col1 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8677f_row0_col1, #T_8677f_row1_col0 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_8677f\" style='display:inline'>\n",
       "  <caption>X_dev distribution</caption>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_8677f_level0_col0\" class=\"col_heading level0 col0\" >target_mean</th>\n",
       "      <th id=\"T_8677f_level0_col1\" class=\"col_heading level0 col1\" >frequency</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_8677f_row0_col0\" class=\"data row0 col0\" >0.3475</td>\n",
       "      <td id=\"T_8677f_row0_col1\" class=\"data row0 col1\" >0.8055</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_8677f_row1_col0\" class=\"data row1 col0\" >0.5439</td>\n",
       "      <td id=\"T_8677f_row1_col1\" class=\"data row1 col1\" >0.1945</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\defra\\Desktop\\git\\PROJECTS\\AutoCarver\\AutoCarver\\discretizers\\utils\\base_discretizer.py:433: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`\n",
      "  sample.X.replace(\n"
     ]
    }
   ],
   "source": [
    "from AutoCarver import BinaryCarver\n",
    "\n",
    "# intiating AutoCarver\n",
    "auto_carver = BinaryCarver(\n",
    "    features=features,\n",
    "    min_freq=min_freq,\n",
    "    dropna=dropna,\n",
    "    verbose=True,  # showing statistics\n",
    "    copy=True,  # whether or not to return a copy of the input dataset\n",
    ")\n",
    "\n",
    "# fitting on training sample, a dev sample can be specified to evaluate carving robustness\n",
    "train_set_processed = auto_carver.fit_transform(train_set, train_set[target], X_dev=dev_set, y_dev=dev_set[target])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## AutoCarver analysis\n",
    "\n",
    "### Carving Summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>content</th>\n",
       "      <th>frequency</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>target_mean</th>\n",
       "      <th>cramerv</th>\n",
       "      <th>tschuprowt</th>\n",
       "      <th>n_mod</th>\n",
       "      <th>label</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">Categorical('Sex')</th>\n",
       "      <th>0.187831</th>\n",
       "      <th>0.533719</th>\n",
       "      <th>0.533719</th>\n",
       "      <th>2</th>\n",
       "      <th>0</th>\n",
       "      <td>male</td>\n",
       "      <td>0.636364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.731481</th>\n",
       "      <th>0.533719</th>\n",
       "      <th>0.533719</th>\n",
       "      <th>2</th>\n",
       "      <th>1</th>\n",
       "      <td>female</td>\n",
       "      <td>0.363636</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">Ordinal('Pclass')</th>\n",
       "      <th>0.548507</th>\n",
       "      <th>0.300144</th>\n",
       "      <th>0.300144</th>\n",
       "      <th>2</th>\n",
       "      <th>0</th>\n",
       "      <td>[2, 1]</td>\n",
       "      <td>0.451178</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.251534</th>\n",
       "      <th>0.300144</th>\n",
       "      <th>0.300144</th>\n",
       "      <th>2</th>\n",
       "      <th>1</th>\n",
       "      <td>3</td>\n",
       "      <td>0.548822</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">Quantitative('Age')</th>\n",
       "      <th>0.636364</th>\n",
       "      <th>0.139166</th>\n",
       "      <th>0.139166</th>\n",
       "      <th>2</th>\n",
       "      <th>0</th>\n",
       "      <td>x &lt;= 8.0e+00</td>\n",
       "      <td>0.074074</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.365455</th>\n",
       "      <th>0.139166</th>\n",
       "      <th>0.139166</th>\n",
       "      <th>2</th>\n",
       "      <th>1</th>\n",
       "      <td>8.0e+00 &lt; x</td>\n",
       "      <td>0.925926</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">Quantitative('Fare')</th>\n",
       "      <th>0.319838</th>\n",
       "      <th>0.295325</th>\n",
       "      <th>0.295325</th>\n",
       "      <th>2</th>\n",
       "      <th>0</th>\n",
       "      <td>x &lt;= 5.2e+01</td>\n",
       "      <td>0.831650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.710000</th>\n",
       "      <th>0.295325</th>\n",
       "      <th>0.295325</th>\n",
       "      <th>2</th>\n",
       "      <th>1</th>\n",
       "      <td>5.2e+01 &lt; x</td>\n",
       "      <td>0.168350</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"3\" valign=\"top\">Quantitative('Siblings/Spouses Aboard')</th>\n",
       "      <th>0.361386</th>\n",
       "      <th>0.139722</th>\n",
       "      <th>0.117492</th>\n",
       "      <th>3</th>\n",
       "      <th>0</th>\n",
       "      <td>x &lt;= 0.00e+00</td>\n",
       "      <td>0.680135</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.500000</th>\n",
       "      <th>0.139722</th>\n",
       "      <th>0.117492</th>\n",
       "      <th>3</th>\n",
       "      <th>1</th>\n",
       "      <td>0.00e+00 &lt; x &lt;= 1.00e+00</td>\n",
       "      <td>0.232323</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.269231</th>\n",
       "      <th>0.139722</th>\n",
       "      <th>0.117492</th>\n",
       "      <th>3</th>\n",
       "      <th>2</th>\n",
       "      <td>1.00e+00 &lt; x</td>\n",
       "      <td>0.087542</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">Quantitative('Parents/Children Aboard')</th>\n",
       "      <th>0.344749</th>\n",
       "      <th>0.136439</th>\n",
       "      <th>0.136439</th>\n",
       "      <th>2</th>\n",
       "      <th>0</th>\n",
       "      <td>x &lt;= 0.0e+00</td>\n",
       "      <td>0.737374</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.500000</th>\n",
       "      <th>0.136439</th>\n",
       "      <th>0.136439</th>\n",
       "      <th>2</th>\n",
       "      <th>1</th>\n",
       "      <td>0.0e+00 &lt; x</td>\n",
       "      <td>0.262626</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                                                      content  \\\n",
       "feature                                 target_mean cramerv  tschuprowt n_mod label                             \n",
       "Categorical('Sex')                      0.187831    0.533719 0.533719   2     0                          male   \n",
       "                                        0.731481    0.533719 0.533719   2     1                        female   \n",
       "Ordinal('Pclass')                       0.548507    0.300144 0.300144   2     0                        [2, 1]   \n",
       "                                        0.251534    0.300144 0.300144   2     1                             3   \n",
       "Quantitative('Age')                     0.636364    0.139166 0.139166   2     0                  x <= 8.0e+00   \n",
       "                                        0.365455    0.139166 0.139166   2     1                   8.0e+00 < x   \n",
       "Quantitative('Fare')                    0.319838    0.295325 0.295325   2     0                  x <= 5.2e+01   \n",
       "                                        0.710000    0.295325 0.295325   2     1                   5.2e+01 < x   \n",
       "Quantitative('Siblings/Spouses Aboard') 0.361386    0.139722 0.117492   3     0                 x <= 0.00e+00   \n",
       "                                        0.500000    0.139722 0.117492   3     1      0.00e+00 < x <= 1.00e+00   \n",
       "                                        0.269231    0.139722 0.117492   3     2                  1.00e+00 < x   \n",
       "Quantitative('Parents/Children Aboard') 0.344749    0.136439 0.136439   2     0                  x <= 0.0e+00   \n",
       "                                        0.500000    0.136439 0.136439   2     1                   0.0e+00 < x   \n",
       "\n",
       "                                                                                     frequency  \n",
       "feature                                 target_mean cramerv  tschuprowt n_mod label             \n",
       "Categorical('Sex')                      0.187831    0.533719 0.533719   2     0       0.636364  \n",
       "                                        0.731481    0.533719 0.533719   2     1       0.363636  \n",
       "Ordinal('Pclass')                       0.548507    0.300144 0.300144   2     0       0.451178  \n",
       "                                        0.251534    0.300144 0.300144   2     1       0.548822  \n",
       "Quantitative('Age')                     0.636364    0.139166 0.139166   2     0       0.074074  \n",
       "                                        0.365455    0.139166 0.139166   2     1       0.925926  \n",
       "Quantitative('Fare')                    0.319838    0.295325 0.295325   2     0       0.831650  \n",
       "                                        0.710000    0.295325 0.295325   2     1       0.168350  \n",
       "Quantitative('Siblings/Spouses Aboard') 0.361386    0.139722 0.117492   3     0       0.680135  \n",
       "                                        0.500000    0.139722 0.117492   3     1       0.232323  \n",
       "                                        0.269231    0.139722 0.117492   3     2       0.087542  \n",
       "Quantitative('Parents/Children Aboard') 0.344749    0.136439 0.136439   2     0       0.737374  \n",
       "                                        0.500000    0.136439 0.136439   2     1       0.262626  "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "auto_carver.summary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* For quantitative feature ``Age``, the selected combination of modalities groups ages as follows:\n",
    "    * modality ``0``: ages lower than or equal to 8 years old (``content=\"x <= 8.0e+00\"``)\n",
    "    * modality ``1``: ages higher than 8 years old (``content=\"8.0e+00 < x\"``)\n",
    "\n",
    "* For qualitative categorical feature ``Sex``, the selected combination keeps ``content=\"male\"`` as modality ``0`` and ``content=\"female\"`` as modality ``1`` (with only two modalities, no grouping is possible)\n",
    "\n",
    "* For qualitative ordinal feature ``Pclass``, the selected combination of modalities groups socio-economic status as follows:\n",
    "    * modality ``0``: upper and middle classes (``content=[2, 1]``)\n",
    "    * modality ``1``: lower class (``content=3``)\n",
    "    * The user-provided ordering of modalities has been preserved."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Detailed overview of tested combinations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>info</th>\n",
       "      <th>cramerv</th>\n",
       "      <th>tschuprowt</th>\n",
       "      <th>combination</th>\n",
       "      <th>n_mod</th>\n",
       "      <th>dropna</th>\n",
       "      <th>train</th>\n",
       "      <th>viable</th>\n",
       "      <th>dev</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Raw distribution</td>\n",
       "      <td>0.321044</td>\n",
       "      <td>0.269965</td>\n",
       "      <td>{'1': '1', '2': '2', '3': '3'}</td>\n",
       "      <td>3</td>\n",
       "      <td>False</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Best for tschuprowt and max_n_mod=5</td>\n",
       "      <td>0.300144</td>\n",
       "      <td>0.300144</td>\n",
       "      <td>{'1': '1', '2': '1', '3': '3'}</td>\n",
       "      <td>2</td>\n",
       "      <td>False</td>\n",
       "      <td>{'viable': True, 'info': ''}</td>\n",
       "      <td>True</td>\n",
       "      <td>{'viable': True, 'info': ''}</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Not checked</td>\n",
       "      <td>0.321044</td>\n",
       "      <td>0.269965</td>\n",
       "      <td>{'1': '1', '2': '2', '3': '3'}</td>\n",
       "      <td>3</td>\n",
       "      <td>False</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Not checked</td>\n",
       "      <td>0.265643</td>\n",
       "      <td>0.265643</td>\n",
       "      <td>{'1': '1', '2': '2', '3': '2'}</td>\n",
       "      <td>2</td>\n",
       "      <td>False</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                  info   cramerv  tschuprowt  \\\n",
       "0                     Raw distribution  0.321044    0.269965   \n",
       "1  Best for tschuprowt and max_n_mod=5  0.300144    0.300144   \n",
       "2                          Not checked  0.321044    0.269965   \n",
       "3                          Not checked  0.265643    0.265643   \n",
       "\n",
       "                      combination  n_mod  dropna  \\\n",
       "0  {'1': '1', '2': '2', '3': '3'}      3   False   \n",
       "1  {'1': '1', '2': '1', '3': '3'}      2   False   \n",
       "2  {'1': '1', '2': '2', '3': '3'}      3   False   \n",
       "3  {'1': '1', '2': '2', '3': '2'}      2   False   \n",
       "\n",
       "                          train viable                           dev  \n",
       "0                           NaN    NaN                           NaN  \n",
       "1  {'viable': True, 'info': ''}   True  {'viable': True, 'info': ''}  \n",
       "2                           NaN    NaN                           NaN  \n",
       "3                           NaN    NaN                           NaN  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "features[\"Pclass\"].history"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* The most associated combination (the first one tested, i.e. the first row where ``info!=\"Raw distribution\"``) groups ``Pclass==1`` with ``Pclass==2`` and leaves ``Pclass==3`` as its own modality\n",
    "\n",
    "* For feature ``Pclass``, this first combination passes the viability tests:\n",
    "    - ``viable=True``\n",
    "    - ``info=\"Best for tschuprowt and max_n_mod=5\"``\n",
    "    - Tschuprow's T with ``Survived`` is ``0.300144`` for this combination (by default, combinations are ranked according to this statistic)\n",
    "    - The following combinations (less associated with the target) were not tested: ``info=\"Not checked\"``\n",
    "\n",
    "* For all combinations, ``dropna=False`` indicates that ``nan``s are not grouped with other modalities (as requested with ``dropna=False``)"
   ]
  },
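  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a point of reference for the ``cramerv`` and ``tschuprowt`` columns, here are the usual textbook definitions of Cramér's V and Tschuprow's T for an $r \\times c$ contingency table with $n$ observations and chi-squared statistic $\\chi^2$. That ``AutoCarver`` computes them exactly this way is an assumption, although the values in the ``history`` table are consistent with it.\n",
    "\n",
    "$$\n",
    "V = \\sqrt{\\frac{\\chi^2}{n \\min(r-1, c-1)}}, \\qquad T = \\sqrt{\\frac{\\chi^2}{n \\sqrt{(r-1)(c-1)}}}\n",
    "$$\n",
    "\n",
    "With a binary target ($c = 2$), this gives $T = V / (r-1)^{1/4}$, so both statistics coincide for two modalities ($r = 2$). For the raw three-class ``Pclass`` distribution ($r = 3$):\n",
    "\n",
    "$$\n",
    "T = \\frac{0.321044}{2^{1/4}} \\approx 0.269965\n",
    "$$\n",
    "\n",
    "Under these definitions, $T$ penalizes extra modalities, which is why the two-modality grouping ($T = 0.300144$) ranks above the raw distribution ($T = 0.269965$) even though the raw distribution has the higher $V$."
   ]
  },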
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Saving and Loading AutoCarver\n",
    "### Saving\n",
    "\n",
    "All **Carvers** can safely be stored as a .json file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "auto_carver.save(\"binary_carver.json\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Loading\n",
    "\n",
    "**Carvers** can safely be loaded from a .json file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "auto_carver = BinaryCarver.load(\"binary_carver.json\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Applying AutoCarver"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
   ],
   "source": [
    "dev_set_processed = auto_carver.transform(dev_set)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Sex</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Siblings/Spouses Aboard</th>\n",
       "      <th>Parents/Children Aboard</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0.0</th>\n",
       "      <td>0.665529</td>\n",
       "      <td>0.450512</td>\n",
       "      <td>0.064846</td>\n",
       "      <td>0.832765</td>\n",
       "      <td>0.682594</td>\n",
       "      <td>0.805461</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1.0</th>\n",
       "      <td>0.334471</td>\n",
       "      <td>0.549488</td>\n",
       "      <td>0.935154</td>\n",
       "      <td>0.167235</td>\n",
       "      <td>0.242321</td>\n",
       "      <td>0.194539</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2.0</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.075085</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          Sex    Pclass       Age      Fare  Siblings/Spouses Aboard  \\\n",
       "0.0  0.665529  0.450512  0.064846  0.832765                 0.682594   \n",
       "1.0  0.334471  0.549488  0.935154  0.167235                 0.242321   \n",
       "2.0       NaN       NaN       NaN       NaN                 0.075085   \n",
       "\n",
       "     Parents/Children Aboard  \n",
       "0.0                 0.805461  \n",
       "1.0                 0.194539  \n",
       "2.0                      NaN  "
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dev_set_processed[auto_carver.features].apply(lambda u: u.value_counts(dropna=False, normalize=True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Feature Selection\n",
    "## Selectors settings\n",
    "### Features to select from\n",
    "\n",
    "Here all features have been carved using ``BinaryCarver``, hence all features are qualitative.\n",
    "\n",
    "### Number of features to select\n",
    "\n",
    "The attribute ``n_best_per_type`` allows one to choose the number of features to be selected per data type (quantitative and qualitative)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "n_best_per_type = 4  # here the number of features is low, ClassificationSelector will only be used to compute useful statistics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using Selectors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " [ClassificationSelector] Selected Qualitative Features \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_bed89_row0_col3, #T_bed89_row4_col5 {\n",
       "  background-color: #b40426;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_bed89_row0_col5, #T_bed89_row5_col3 {\n",
       "  background-color: #3b4cc0;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_bed89_row1_col3 {\n",
       "  background-color: #ccd9ed;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_bed89_row1_col5, #T_bed89_row3_col5 {\n",
       "  background-color: #80a3fa;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_bed89_row2_col3 {\n",
       "  background-color: #c9d7f0;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_bed89_row2_col5 {\n",
       "  background-color: #e57058;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_bed89_row3_col3 {\n",
       "  background-color: #4a63d3;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_bed89_row4_col3 {\n",
       "  background-color: #485fd1;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_bed89_row5_col5 {\n",
       "  background-color: #df634e;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_bed89\" style='display:inline'>\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_bed89_level0_col0\" class=\"col_heading level0 col0\" >feature</th>\n",
       "      <th id=\"T_bed89_level0_col1\" class=\"col_heading level0 col1\" >Nan</th>\n",
       "      <th id=\"T_bed89_level0_col2\" class=\"col_heading level0 col2\" >Mode</th>\n",
       "      <th id=\"T_bed89_level0_col3\" class=\"col_heading level0 col3\" >TschuprowtMeasure</th>\n",
       "      <th id=\"T_bed89_level0_col4\" class=\"col_heading level0 col4\" >TschuprowtRank</th>\n",
       "      <th id=\"T_bed89_level0_col5\" class=\"col_heading level0 col5\" >TschuprowtFilter</th>\n",
       "      <th id=\"T_bed89_level0_col6\" class=\"col_heading level0 col6\" >TschuprowtWith</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_bed89_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
       "      <td id=\"T_bed89_row0_col0\" class=\"data row0 col0\" >Categorical('Sex')</td>\n",
       "      <td id=\"T_bed89_row0_col1\" class=\"data row0 col1\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row0_col2\" class=\"data row0 col2\" >0.6364</td>\n",
       "      <td id=\"T_bed89_row0_col3\" class=\"data row0 col3\" >0.5337</td>\n",
       "      <td id=\"T_bed89_row0_col4\" class=\"data row0 col4\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row0_col5\" class=\"data row0 col5\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row0_col6\" class=\"data row0 col6\" >itself</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_bed89_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
       "      <td id=\"T_bed89_row1_col0\" class=\"data row1 col0\" >Ordinal('Pclass')</td>\n",
       "      <td id=\"T_bed89_row1_col1\" class=\"data row1 col1\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row1_col2\" class=\"data row1 col2\" >0.5488</td>\n",
       "      <td id=\"T_bed89_row1_col3\" class=\"data row1 col3\" >0.3001</td>\n",
       "      <td id=\"T_bed89_row1_col4\" class=\"data row1 col4\" >1.0000</td>\n",
       "      <td id=\"T_bed89_row1_col5\" class=\"data row1 col5\" >0.0988</td>\n",
       "      <td id=\"T_bed89_row1_col6\" class=\"data row1 col6\" >Sex</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_bed89_level0_row2\" class=\"row_heading level0 row2\" >3</th>\n",
       "      <td id=\"T_bed89_row2_col0\" class=\"data row2 col0\" >Quantitative('Fare')</td>\n",
       "      <td id=\"T_bed89_row2_col1\" class=\"data row2 col1\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row2_col2\" class=\"data row2 col2\" >0.8316</td>\n",
       "      <td id=\"T_bed89_row2_col3\" class=\"data row2 col3\" >0.2953</td>\n",
       "      <td id=\"T_bed89_row2_col4\" class=\"data row2 col4\" >2.0000</td>\n",
       "      <td id=\"T_bed89_row2_col5\" class=\"data row2 col5\" >0.3922</td>\n",
       "      <td id=\"T_bed89_row2_col6\" class=\"data row2 col6\" >Pclass</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_bed89_level0_row3\" class=\"row_heading level0 row3\" >2</th>\n",
       "      <td id=\"T_bed89_row3_col0\" class=\"data row3 col0\" >Quantitative('Age')</td>\n",
       "      <td id=\"T_bed89_row3_col1\" class=\"data row3 col1\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row3_col2\" class=\"data row3 col2\" >0.9259</td>\n",
       "      <td id=\"T_bed89_row3_col3\" class=\"data row3 col3\" >0.1392</td>\n",
       "      <td id=\"T_bed89_row3_col4\" class=\"data row3 col4\" >3.0000</td>\n",
       "      <td id=\"T_bed89_row3_col5\" class=\"data row3 col5\" >0.1002</td>\n",
       "      <td id=\"T_bed89_row3_col6\" class=\"data row3 col6\" >Sex</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_bed89_level0_row4\" class=\"row_heading level0 row4\" >5</th>\n",
       "      <td id=\"T_bed89_row4_col0\" class=\"data row4 col0\" >Quantitative('Parents/Children Aboard')</td>\n",
       "      <td id=\"T_bed89_row4_col1\" class=\"data row4 col1\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row4_col2\" class=\"data row4 col2\" >0.7374</td>\n",
       "      <td id=\"T_bed89_row4_col3\" class=\"data row4 col3\" >0.1364</td>\n",
       "      <td id=\"T_bed89_row4_col4\" class=\"data row4 col4\" >4.0000</td>\n",
       "      <td id=\"T_bed89_row4_col5\" class=\"data row4 col5\" >0.4666</td>\n",
       "      <td id=\"T_bed89_row4_col6\" class=\"data row4 col6\" >Age</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_bed89_level0_row5\" class=\"row_heading level0 row5\" >4</th>\n",
       "      <td id=\"T_bed89_row5_col0\" class=\"data row5 col0\" >Quantitative('Siblings/Spouses Aboard')</td>\n",
       "      <td id=\"T_bed89_row5_col1\" class=\"data row5 col1\" >0.0000</td>\n",
       "      <td id=\"T_bed89_row5_col2\" class=\"data row5 col2\" >0.6801</td>\n",
       "      <td id=\"T_bed89_row5_col3\" class=\"data row5 col3\" >0.1175</td>\n",
       "      <td id=\"T_bed89_row5_col4\" class=\"data row5 col4\" >5.0000</td>\n",
       "      <td id=\"T_bed89_row5_col5\" class=\"data row5 col5\" >0.4060</td>\n",
       "      <td id=\"T_bed89_row5_col6\" class=\"data row5 col6\" >Parents/Children Aboard</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "Features(['Sex', 'Pclass', 'Fare', 'Age'])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from AutoCarver import ClassificationSelector\n",
    "\n",
    "# select the qualitative features most associated with the target\n",
    "feature_selector = ClassificationSelector(\n",
    "    features=features,\n",
    "    n_best_per_type=n_best_per_type,\n",
    "    verbose=True,  # displays statistics\n",
    ")\n",
    "best_features = feature_selector.select(train_set_processed, train_set_processed[target])\n",
    "best_features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Sex</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Age</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>617</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>489</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>871</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>654</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>653</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Sex  Pclass  Fare  Age\n",
       "617    0       1   0.0  1.0\n",
       "489    0       0   0.0  1.0\n",
       "871    1       1   0.0  1.0\n",
       "654    1       1   0.0  1.0\n",
       "653    0       1   0.0  1.0"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_set_processed[best_features].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Feature ``Sex`` is the most associated with the target ``Survived``:\n",
    "    - Tschuprow's T value is ``TschuprowtMeasure=0.5337``\n",
    "    - It has 0 % of NaNs (``Nan=0.0``)\n",
    "    - Its mode represents 64 % of observed data (``Mode=0.6364``)\n",
    "\n",
    "* Feature ``Fare`` is strongly associated with feature ``Pclass``:\n",
    "    - Tschuprow's T value is ``TschuprowtFilter=0.3922`` with ``TschuprowtWith=Pclass``\n",
    "\n",
    "* Here, no features were filtered out for their inter-feature association or over-represented values (no thresholds were set)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Modeling\n",
    "Fitting model on train data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "ename": "AttributeError",
     "evalue": "'super' object has no attribute '__sklearn_tags__'",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "File \u001b[1;32mc:\\Users\\defra\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\autocarver-i96ERKJw-py3.9\\lib\\site-packages\\IPython\\core\\formatters.py:974\u001b[0m, in \u001b[0;36mMimeBundleFormatter.__call__\u001b[1;34m(self, obj, include, exclude)\u001b[0m\n\u001b[0;32m    971\u001b[0m     method \u001b[38;5;241m=\u001b[39m get_real_method(obj, \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mprint_method)\n\u001b[0;32m    973\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m method \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m--> 974\u001b[0m         \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mmethod\u001b[49m\u001b[43m(\u001b[49m\u001b[43minclude\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43minclude\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mexclude\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mexclude\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m    975\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[0;32m    976\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n",
      "File \u001b[1;32mc:\\Users\\defra\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\autocarver-i96ERKJw-py3.9\\lib\\site-packages\\sklearn\\base.py:469\u001b[0m, in \u001b[0;36mBaseEstimator._repr_mimebundle_\u001b[1;34m(self, **kwargs)\u001b[0m\n\u001b[0;32m    467\u001b[0m output \u001b[38;5;241m=\u001b[39m {\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtext/plain\u001b[39m\u001b[38;5;124m\"\u001b[39m: \u001b[38;5;28mrepr\u001b[39m(\u001b[38;5;28mself\u001b[39m)}\n\u001b[0;32m    468\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m get_config()[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdisplay\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdiagram\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m--> 469\u001b[0m     output[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtext/html\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m=\u001b[39m \u001b[43mestimator_html_repr\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[0;32m    470\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m output\n",
      "File \u001b[1;32mc:\\Users\\defra\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\autocarver-i96ERKJw-py3.9\\lib\\site-packages\\sklearn\\utils\\_estimator_html_repr.py:387\u001b[0m, in \u001b[0;36mestimator_html_repr\u001b[1;34m(estimator)\u001b[0m\n\u001b[0;32m    385\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m    386\u001b[0m     \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m--> 387\u001b[0m         \u001b[43mcheck_is_fitted\u001b[49m\u001b[43m(\u001b[49m\u001b[43mestimator\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m    388\u001b[0m         status_label \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m<span>Fitted</span>\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m    389\u001b[0m         is_fitted_css_class \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfitted\u001b[39m\u001b[38;5;124m\"\u001b[39m\n",
      "File \u001b[1;32mc:\\Users\\defra\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\autocarver-i96ERKJw-py3.9\\lib\\site-packages\\sklearn\\utils\\validation.py:1751\u001b[0m, in \u001b[0;36mcheck_is_fitted\u001b[1;34m(estimator, attributes, msg, all_or_any)\u001b[0m\n\u001b[0;32m   1748\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28mhasattr\u001b[39m(estimator, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfit\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[0;32m   1749\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m%s\u001b[39;00m\u001b[38;5;124m is not an estimator instance.\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;241m%\u001b[39m (estimator))\n\u001b[1;32m-> 1751\u001b[0m tags \u001b[38;5;241m=\u001b[39m \u001b[43mget_tags\u001b[49m\u001b[43m(\u001b[49m\u001b[43mestimator\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m   1753\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m tags\u001b[38;5;241m.\u001b[39mrequires_fit \u001b[38;5;129;01mand\u001b[39;00m attributes \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m   1754\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m\n",
      "File \u001b[1;32mc:\\Users\\defra\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\autocarver-i96ERKJw-py3.9\\lib\\site-packages\\sklearn\\utils\\_tags.py:430\u001b[0m, in \u001b[0;36mget_tags\u001b[1;34m(estimator)\u001b[0m\n\u001b[0;32m    428\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m klass \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mreversed\u001b[39m(\u001b[38;5;28mtype\u001b[39m(estimator)\u001b[38;5;241m.\u001b[39mmro()):\n\u001b[0;32m    429\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m__sklearn_tags__\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mvars\u001b[39m(klass):\n\u001b[1;32m--> 430\u001b[0m         sklearn_tags_provider[klass] \u001b[38;5;241m=\u001b[39m \u001b[43mklass\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m__sklearn_tags__\u001b[49m\u001b[43m(\u001b[49m\u001b[43mestimator\u001b[49m\u001b[43m)\u001b[49m  \u001b[38;5;66;03m# type: ignore[attr-defined]\u001b[39;00m\n\u001b[0;32m    431\u001b[0m         class_order\u001b[38;5;241m.\u001b[39mappend(klass)\n\u001b[0;32m    432\u001b[0m     \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m_more_tags\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mvars\u001b[39m(klass):\n",
      "File \u001b[1;32mc:\\Users\\defra\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\autocarver-i96ERKJw-py3.9\\lib\\site-packages\\sklearn\\base.py:540\u001b[0m, in \u001b[0;36mClassifierMixin.__sklearn_tags__\u001b[1;34m(self)\u001b[0m\n\u001b[0;32m    539\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21m__sklearn_tags__\u001b[39m(\u001b[38;5;28mself\u001b[39m):\n\u001b[1;32m--> 540\u001b[0m     tags \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m__sklearn_tags__\u001b[49m()\n\u001b[0;32m    541\u001b[0m     tags\u001b[38;5;241m.\u001b[39mestimator_type \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mclassifier\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m    542\u001b[0m     tags\u001b[38;5;241m.\u001b[39mclassifier_tags \u001b[38;5;241m=\u001b[39m ClassifierTags()\n",
      "\u001b[1;31mAttributeError\u001b[0m: 'super' object has no attribute '__sklearn_tags__'"
     ]
    },
    {
     "data": {
      "text/plain": [
       "XGBClassifier(base_score=None, booster=None, callbacks=None,\n",
       "              colsample_bylevel=None, colsample_bynode=None,\n",
       "              colsample_bytree=None, device=None, early_stopping_rounds=None,\n",
       "              enable_categorical=False, eval_metric=None, feature_types=None,\n",
       "              gamma=None, grow_policy=None, importance_type=None,\n",
       "              interaction_constraints=None, learning_rate=None, max_bin=None,\n",
       "              max_cat_threshold=None, max_cat_to_onehot=None,\n",
       "              max_delta_step=None, max_depth=None, max_leaves=None,\n",
       "              min_child_weight=None, missing=nan, monotone_constraints=None,\n",
       "              multi_strategy=None, n_estimators=None, n_jobs=None,\n",
       "              num_parallel_tree=None, random_state=None, ...)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
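   ],
   "source": [
    "from xgboost import XGBClassifier\n",
    "\n",
    "# train a gradient-boosted classifier (default hyperparameters) on the processed training set, restricted to best_features\n",
    "model = XGBClassifier()\n",
    "model.fit(train_set_processed[best_features], train_set_processed[target])"
   ]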
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Saving the trained model to a JSON file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.save_model(\"binary_xgboost.json\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Predicting on the dev set and measuring performance (ROC AUC)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.float64(0.8548426745329402)"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
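   ],
   "source": [
    "from sklearn.metrics import roc_auc_score\n",
    "\n",
    "# predicted probability of the positive class on the dev set\n",
    "dev_pred = model.predict_proba(dev_set_processed[best_features])[:, 1]\n",
    "# ROC AUC of those probabilities against the true dev labels\n",
    "roc_auc_score(dev_set_processed[target], dev_pred)"
   ]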
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What's next?\n",
    "\n",
    "* Thanks to **Carvers**, your features are now discretized to maximize their association with the binary target.\n",
    "* As a final step towards your model, **Selectors** are handy tools for target-optimal feature pre-selection, so make sure to check out [Selectors Examples](https://autocarver.readthedocs.io/en/latest/selectors_examples.html)!\n",
    "\n",
    "## Well done!\n",
    "\n",
    "By preprocessing the Titanic Dataset with **AutoCarver**'s ``BinaryCarver``, you have discretized its features for binary classification and set the stage for robust and accurate machine learning models.\n",
    "\n",
    "Along the way, ``BinaryCarver`` carved out a path toward better feature representation and model interpretability.\n",
    "\n",
    "As you keep iterating on your models, may the carefully crafted features and preprocessing steps contribute to their success. We're excited to see the impact of your work and grateful to be part of your data science endeavors.\n",
    "\n",
    "Thank you for trusting **AutoCarver**, and we wish you continued success in your data-driven ventures."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "autocarver-i96ERKJw-py3.9",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}