{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Ablation Studies\n",
"\n",
"The gold standard in building complex machine learning models is proving that each constituent part of the model contributes something to the proposed solution. \n",
"\n",
"Ablation studies play a pivotal role in this process by systematically dissecting machine learning models and evaluating the impact of individual components. By selectively removing or disabling specific features, layers, or modules within the model and observing the resulting changes in performance, we can assess the significance of each component in achieving the desired outcome. \n",
"\n",
"Ablation studies offer invaluable insights into the inner workings of complex models, shedding light on which elements are essential for model performance and which may be redundant or less influential. This rigorous approach not only validates the effectiveness of the model architecture but also provides guidance for model refinement and optimization, ultimately advancing the field of machine learning and enhancing the reproducibility and reliability of research findings.\n",
"\n",
"In this section, we’ll finally discuss how to present complex machine learning models in publications and ensure the viability of each part we engineered to solve our particular problem set. \n",
"\n",
"(Granted on a toy problem, so there's not too much we can ablate...)\n",
"\n",
"First we'll build a quick model, as we did in [the Data notebook](/notebooks/0-basic-data-prep-and-model.html)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-13T01:42:25.558462Z",
"iopub.status.busy": "2022-12-13T01:42:25.558462Z",
"iopub.status.idle": "2022-12-13T01:42:25.570463Z",
"shell.execute_reply": "2022-12-13T01:42:25.569963Z"
}
},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings('ignore')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "54158e1d",
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-13T01:42:25.572964Z",
"iopub.status.busy": "2022-12-13T01:42:25.572964Z",
"iopub.status.idle": "2022-12-13T01:42:25.585465Z",
"shell.execute_reply": "2022-12-13T01:42:25.585465Z"
}
},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"DATA_FOLDER = Path(\"..\", \"..\") / \"data\"\n",
"DATA_FILEPATH = DATA_FOLDER / \"penguins_clean.csv\""
]
},
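{
"cell_type": "markdown",
"metadata": {},
"source": [
"The \"Full\" baseline model scored below is not defined anywhere in this notebook, so here is a minimal sketch following [the Data notebook](/notebooks/0-basic-data-prep-and-model.html). The exact column names, split parameters, and SVC settings are assumptions reconstructed from the ablation cells further down."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from sklearn.model_selection import train_test_split, cross_val_score\n",
"from sklearn.preprocessing import StandardScaler, OneHotEncoder\n",
"from sklearn.compose import ColumnTransformer\n",
"from sklearn.pipeline import Pipeline\n",
"from sklearn.svm import SVC\n",
"\n",
"penguins = pd.read_csv(DATA_FILEPATH)\n",
"\n",
"# Assumed feature columns of the cleaned penguin data\n",
"num_features = ['Culmen Length (mm)', 'Culmen Depth (mm)', 'Flipper Length (mm)']\n",
"cat_features = ['Sex']\n",
"features = num_features + cat_features\n",
"target = 'Species'\n",
"\n",
"X_train, X_test, y_train, y_test = train_test_split(\n",
"    penguins[features], penguins[target], stratify=penguins[target], train_size=0.7, random_state=42\n",
")\n",
"\n",
"# Full model: standardise numeric features, one-hot encode categorical ones\n",
"num_transformer = StandardScaler()\n",
"cat_transformer = OneHotEncoder(handle_unknown='ignore')\n",
"\n",
"preprocessor = ColumnTransformer(transformers=[\n",
"    ('num', num_transformer, num_features),\n",
"    ('cat', cat_transformer, cat_features)\n",
"])\n",
"\n",
"model = Pipeline(steps=[\n",
"    ('preprocessor', preprocessor),\n",
"    ('classifier', SVC(random_state=42)),\n",
"])\n",
"\n",
"scores = cross_val_score(model, X_test, y_test, cv=10)"
]
},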
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-13T01:42:25.588466Z",
"iopub.status.busy": "2022-12-13T01:42:25.587966Z",
"iopub.status.idle": "2022-12-13T01:42:25.973559Z",
"shell.execute_reply": "2022-12-13T01:42:25.973059Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
" Average Deviation\n",
"Full 1.0 0.0"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"scoring = pd.DataFrame(columns=[\"Average\", \"Deviation\"])\n",
"scoring.loc[\"Full\", :] = [scores.mean(), scores.std()]\n",
"scoring"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's compare this to a model that doesn't scale the numeric inputs. \n",
"\n",
"Here using the pipelines comes in handy, because switching off certain components just changes those into a `noop`."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-13T01:42:26.238801Z",
"iopub.status.busy": "2022-12-13T01:42:26.238801Z",
"iopub.status.idle": "2022-12-13T01:42:26.267291Z",
"shell.execute_reply": "2022-12-13T01:42:26.266791Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
" Average Deviation\n",
"Full 1.0 0.0\n",
"No Standardisation 0.435455 0.045172\n",
"Single Column Sex 1.0 0.0"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# num_transformer = StandardScaler()\n",
"cat_transformer = OneHotEncoder(handle_unknown='ignore')\n",
"\n",
"preprocessor = ColumnTransformer(transformers=[\n",
" # ('num', num_transformer, num_features),\n",
" ('cat', cat_transformer, cat_features)\n",
"])\n",
"\n",
"\n",
"model2 = Pipeline(\n",
" steps=[\n",
" (\"preprocessor\", preprocessor),\n",
" (\"classifier\", SVC(random_state=42)),\n",
" ]\n",
")\n",
"\n",
"scores = cross_val_score(model2, X_test, y_test, cv=10)\n",
"\n",
"scoring.loc[\"No Standardisation\",:] = [scores.mean(), scores.std()]\n",
"scoring"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Those scores decrease significantly when we remove the standardisation step. This is because the SVM algorithm is sensitive to the scale of the features. The standardisation step is crucial to ensure that the model can learn from the data.\n",
"\n",
"Now we can evaluate if the model should use a singular column for `Sex`, since it is basically a binary column in our research data."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-13T01:42:26.285110Z",
"iopub.status.busy": "2022-12-13T01:42:26.284610Z",
"iopub.status.idle": "2022-12-13T01:42:26.313615Z",
"shell.execute_reply": "2022-12-13T01:42:26.313115Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
" Average Deviation\n",
"Full 1.0 0.0\n",
"No Standardisation 0.435455 0.045172\n",
"Single Column Sex 1.0 0.0"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\n",
"num_transformer = StandardScaler()\n",
"cat_transformer = OneHotEncoder(handle_unknown='ignore', drop='if_binary')\n",
"\n",
"preprocessor = ColumnTransformer(transformers=[\n",
" ('num', num_transformer, num_features),\n",
" ('cat', cat_transformer, cat_features)\n",
"])\n",
"\n",
"\n",
"model2 = Pipeline(steps=[\n",
" ('preprocessor', preprocessor),\n",
" ('classifier', SVC()),\n",
"])\n",
"\n",
"scores = cross_val_score(model2, X_test, y_test, cv=10)\n",
"\n",
"scoring.loc[\"Single Column Sex\",:] = [scores.mean(), scores.std()]\n",
"scoring"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This seems to not affect the model, so we can actually encode the catergorical information as binary in this case, instead of adding a feature column for `Male` and `Female`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Clearly this is a toy example and we knew that switching off standardisation would have this effect. In the real world, we would, however, switch off entire components of neural networks to evaluate the impact they have on our model performance. This strengthens the claims we make in our research significantly and usually leads to easier reviews. "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.10.8 ('pydata-global-2022-ml-repro')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
},
"vscode": {
"interpreter": {
"hash": "d7369b48cea8bb1af6d88d25f2646d14ea11b68d7457d74f06fbf0d68480668d"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}