{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Model Evaluation\n", "\n", "Applying machine learning in an applied science context is often method work. We build a prototype model and want to show that this method can be applied to our specific problem. This means we have to make sure that the insights we glean from this application generalize to new data from the same problem set.\n", "\n", "This is why we usually import `train_test_split()` from scikit-learn to get a validation set and a test set. But in my experience, in real-world applications, this isn’t always enough. In science, we usually deal with data that is correlated along some dimension. Sometimes we have geospatial data and have to account for Tobler’s Law, i.e. data points that are close to each other are more related than data points that are far apart. Sometimes we have temporal correlations in time series, where data points closer in time may influence each other.\n", "\n", "Neglecting proper validation will often lead to additional review cycles in a paper submission. It might lead to a rejection of the manuscript, which is bad enough. In the worst-case scenario, our research might report incorrect conclusions and have to be retracted. No one wants rejections or even retractions.\n", "\n", "So we’ll go into some methods to properly evaluate machine learning models even when our data is not “independent and identically distributed”." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2022-12-13T01:42:10.608399Z", "iopub.status.busy": "2022-12-13T01:42:10.607899Z", "iopub.status.idle": "2022-12-13T01:42:10.619565Z", "shell.execute_reply": "2022-12-13T01:42:10.619064Z" }, "tags": [] }, "outputs": [], "source": [ "from pathlib import Path\n", "\n", "DATA_FOLDER = Path(\"..\", \"..\") / \"data\"\n", "DATA_FILEPATH = DATA_FOLDER / \"penguins_clean.csv\"" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2022-12-13T01:42:10.622066Z", "iopub.status.busy": "2022-12-13T01:42:10.621565Z", "iopub.status.idle": "2022-12-13T01:42:11.022636Z", "shell.execute_reply": "2022-12-13T01:42:11.022136Z" }, "lines_to_next_cell": 2, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
| | Culmen Length (mm) | Culmen Depth (mm) | Flipper Length (mm) | Sex | Species |
|---|---|---|---|---|---|
| 0 | 39.1 | 18.7 | 181.0 | MALE | Adelie Penguin (Pygoscelis adeliae) |
| 1 | 39.5 | 17.4 | 186.0 | FEMALE | Adelie Penguin (Pygoscelis adeliae) |
| 2 | 40.3 | 18.0 | 195.0 | FEMALE | Adelie Penguin (Pygoscelis adeliae) |
| 3 | 36.7 | 19.3 | 193.0 | FEMALE | Adelie Penguin (Pygoscelis adeliae) |
| 4 | 39.3 | 20.6 | 190.0 | MALE | Adelie Penguin (Pygoscelis adeliae) |