{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tremor analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This tutorial shows how to run the tremor pipeline to obtain aggregated tremor measures from gyroscope sensor data. Before following along, make sure you have completed all steps in the data preparation tutorial.\n",
    "\n",
    "In this tutorial, we use two days of data from a participant of the Personalized Parkinson Project to demonstrate the functionality. Since `ParaDigMa` expects contiguous time series, the collected data was stored in two segments, each with contiguous timestamps. Per segment, we load the data and perform the following steps:\n",
    "1. Preprocess the time series data\n",
    "2. Extract tremor features\n",
    "3. Detect tremor\n",
    "4. Quantify tremor\n",
    "\n",
    "We then combine the output of the different segments for the final step:\n",
    "\n",
    "5. Compute aggregated tremor measures"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load example data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we start by loading a single contiguous time series (segment), for which we continue running steps 1-4. [Below](#multiple_segments_cell) we show how to run these steps for multiple segments.\n",
    "\n",
    "We use the internally developed `TSDF` ([documentation](https://biomarkersparkinson.github.io/tsdf/)) to load and store data [[1](https://arxiv.org/abs/2211.11294)]. If your time series data is stored in a different format, other Python functions can load it into memory, depending on the file extension:\n",
    "- _.csv_: `pandas.read_csv()` ([documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html))\n",
    "- _.json_: `json.load()` ([documentation](https://docs.python.org/3/library/json.html#json.load))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time</th>\n",
       "      <th>accelerometer_x</th>\n",
       "      <th>accelerometer_y</th>\n",
       "      <th>accelerometer_z</th>\n",
       "      <th>gyroscope_x</th>\n",
       "      <th>gyroscope_y</th>\n",
       "      <th>gyroscope_z</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>-0.474641</td>\n",
       "      <td>-0.379426</td>\n",
       "      <td>0.770335</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.402439</td>\n",
       "      <td>0.243902</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.009933</td>\n",
       "      <td>-0.472727</td>\n",
       "      <td>-0.378947</td>\n",
       "      <td>0.765072</td>\n",
       "      <td>0.426829</td>\n",
       "      <td>0.670732</td>\n",
       "      <td>-0.121951</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.019867</td>\n",
       "      <td>-0.471770</td>\n",
       "      <td>-0.375598</td>\n",
       "      <td>0.766986</td>\n",
       "      <td>1.158537</td>\n",
       "      <td>-0.060976</td>\n",
       "      <td>-0.304878</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.029800</td>\n",
       "      <td>-0.472727</td>\n",
       "      <td>-0.375598</td>\n",
       "      <td>0.770335</td>\n",
       "      <td>1.158537</td>\n",
       "      <td>-0.548780</td>\n",
       "      <td>-0.548780</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.039733</td>\n",
       "      <td>-0.475120</td>\n",
       "      <td>-0.379426</td>\n",
       "      <td>0.772249</td>\n",
       "      <td>0.670732</td>\n",
       "      <td>-0.609756</td>\n",
       "      <td>-0.731707</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3455326</th>\n",
       "      <td>34339.561333</td>\n",
       "      <td>-0.257895</td>\n",
       "      <td>-0.319139</td>\n",
       "      <td>-0.761244</td>\n",
       "      <td>159.329269</td>\n",
       "      <td>14.634146</td>\n",
       "      <td>-28.658537</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3455327</th>\n",
       "      <td>34339.571267</td>\n",
       "      <td>-0.555502</td>\n",
       "      <td>-0.153110</td>\n",
       "      <td>-0.671292</td>\n",
       "      <td>125.060976</td>\n",
       "      <td>-213.902440</td>\n",
       "      <td>-19.329268</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3455328</th>\n",
       "      <td>34339.581200</td>\n",
       "      <td>-0.286124</td>\n",
       "      <td>-0.263636</td>\n",
       "      <td>-0.981340</td>\n",
       "      <td>158.658537</td>\n",
       "      <td>-328.170733</td>\n",
       "      <td>-3.170732</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3455329</th>\n",
       "      <td>34339.591133</td>\n",
       "      <td>-0.232536</td>\n",
       "      <td>-0.161722</td>\n",
       "      <td>-0.832536</td>\n",
       "      <td>288.841465</td>\n",
       "      <td>-281.707318</td>\n",
       "      <td>17.073171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3455330</th>\n",
       "      <td>34339.601067</td>\n",
       "      <td>0.180383</td>\n",
       "      <td>-0.368421</td>\n",
       "      <td>-1.525837</td>\n",
       "      <td>376.219514</td>\n",
       "      <td>-140.853659</td>\n",
       "      <td>37.256098</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3455331 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                 time  accelerometer_x  accelerometer_y  accelerometer_z  \\\n",
       "0            0.000000        -0.474641        -0.379426         0.770335   \n",
       "1            0.009933        -0.472727        -0.378947         0.765072   \n",
       "2            0.019867        -0.471770        -0.375598         0.766986   \n",
       "3            0.029800        -0.472727        -0.375598         0.770335   \n",
       "4            0.039733        -0.475120        -0.379426         0.772249   \n",
       "...               ...              ...              ...              ...   \n",
       "3455326  34339.561333        -0.257895        -0.319139        -0.761244   \n",
       "3455327  34339.571267        -0.555502        -0.153110        -0.671292   \n",
       "3455328  34339.581200        -0.286124        -0.263636        -0.981340   \n",
       "3455329  34339.591133        -0.232536        -0.161722        -0.832536   \n",
       "3455330  34339.601067         0.180383        -0.368421        -1.525837   \n",
       "\n",
       "         gyroscope_x  gyroscope_y  gyroscope_z  \n",
       "0           0.000000     1.402439     0.243902  \n",
       "1           0.426829     0.670732    -0.121951  \n",
       "2           1.158537    -0.060976    -0.304878  \n",
       "3           1.158537    -0.548780    -0.548780  \n",
       "4           0.670732    -0.609756    -0.731707  \n",
       "...              ...          ...          ...  \n",
       "3455326   159.329269    14.634146   -28.658537  \n",
       "3455327   125.060976  -213.902440   -19.329268  \n",
       "3455328   158.658537  -328.170733    -3.170732  \n",
       "3455329   288.841465  -281.707318    17.073171  \n",
       "3455330   376.219514  -140.853659    37.256098  \n",
       "\n",
       "[3455331 rows x 7 columns]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from pathlib import Path\n",
    "from paradigma.util import load_tsdf_dataframe\n",
    "\n",
    "# Set the path to where the prepared data is saved and load the data.\n",
    "# Note: the test data is stored in TSDF, but you can load your data in your own way\n",
    "path_to_data = Path('../../example_data')\n",
    "path_to_prepared_data = path_to_data / 'imu'\n",
    "\n",
    "segment_nr = '0001'\n",
    "\n",
    "df_data, metadata_time, metadata_values = load_tsdf_dataframe(path_to_prepared_data, prefix=f'IMU_segment{segment_nr}')\n",
    "\n",
    "df_data"
   ]
  },
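  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If your prepared data is stored as CSV rather than TSDF, loading it could look roughly like the sketch below. The rows are invented and an in-memory buffer stands in for a file on disk; only the column names are taken from `df_data` above. The check on the column names is an illustration, not a documented ParaDigMa requirement."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import io\n",
    "import pandas as pd\n",
    "\n",
    "# Column names as shown in df_data above\n",
    "expected_columns = [\n",
    "    'time',\n",
    "    'accelerometer_x', 'accelerometer_y', 'accelerometer_z',\n",
    "    'gyroscope_x', 'gyroscope_y', 'gyroscope_z',\n",
    "]\n",
    "\n",
    "# Invented example rows, written to an in-memory buffer that stands in for a .csv file\n",
    "example_rows = [\n",
    "    [0.00, -0.47, -0.38, 0.77, 0.00, 1.40, 0.24],\n",
    "    [0.01, -0.47, -0.38, 0.77, 0.43, 0.67, -0.12],\n",
    "]\n",
    "csv_buffer = io.StringIO()\n",
    "pd.DataFrame(example_rows, columns=expected_columns).to_csv(csv_buffer, index=False)\n",
    "csv_buffer.seek(0)\n",
    "\n",
    "# Load the CSV and verify that all expected columns are present\n",
    "df_from_csv = pd.read_csv(csv_buffer)\n",
    "missing_columns = set(expected_columns) - set(df_from_csv.columns)\n",
    "assert not missing_columns, f'Missing columns: {missing_columns}'\n",
    "\n",
    "df_from_csv"
   ]
  },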
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1: Preprocess data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "IMU sensors nominally collect data at a fixed sampling frequency, but in practice the sampling is not perfectly uniform, so the time differences between consecutive timestamps vary. The [`preprocess_imu_data`](https://github.com/biomarkersParkinson/paradigma/blob/main/src/paradigma/preprocessing.py#:~:text=preprocess_imu_data) function therefore creates uniformly spaced timestamps and interpolates the IMU values at these new timestamps, using the original timestamps and their corresponding IMU values. By setting `sensor` to `'gyroscope'`, only gyroscope data is preprocessed and the accelerometer data is removed from the dataframe. A `watch_side` must also be provided, although for the tremor analysis it does not matter whether this is the correct side, since the tremor features are not influenced by the orientation of the gyroscope axes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The data is resampled to 100 Hz.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time</th>\n",
       "      <th>gyroscope_x</th>\n",
       "      <th>gyroscope_y</th>\n",
       "      <th>gyroscope_z</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.402439</td>\n",
       "      <td>0.243902</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.01</td>\n",
       "      <td>0.432231</td>\n",
       "      <td>0.665526</td>\n",
       "      <td>-0.123434</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.02</td>\n",
       "      <td>1.164277</td>\n",
       "      <td>-0.069584</td>\n",
       "      <td>-0.307536</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.03</td>\n",
       "      <td>1.151432</td>\n",
       "      <td>-0.554928</td>\n",
       "      <td>-0.554223</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.04</td>\n",
       "      <td>0.657189</td>\n",
       "      <td>-0.603207</td>\n",
       "      <td>-0.731570</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3433956</th>\n",
       "      <td>34339.56</td>\n",
       "      <td>130.392434</td>\n",
       "      <td>29.491627</td>\n",
       "      <td>-26.868202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3433957</th>\n",
       "      <td>34339.57</td>\n",
       "      <td>135.771133</td>\n",
       "      <td>-184.515525</td>\n",
       "      <td>-21.544211</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3433958</th>\n",
       "      <td>34339.58</td>\n",
       "      <td>146.364103</td>\n",
       "      <td>-324.248909</td>\n",
       "      <td>-5.248641</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3433959</th>\n",
       "      <td>34339.59</td>\n",
       "      <td>273.675024</td>\n",
       "      <td>-293.011330</td>\n",
       "      <td>14.618256</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3433960</th>\n",
       "      <td>34339.60</td>\n",
       "      <td>372.878731</td>\n",
       "      <td>-158.516265</td>\n",
       "      <td>35.330770</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3433961 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "             time  gyroscope_x  gyroscope_y  gyroscope_z\n",
       "0            0.00     0.000000     1.402439     0.243902\n",
       "1            0.01     0.432231     0.665526    -0.123434\n",
       "2            0.02     1.164277    -0.069584    -0.307536\n",
       "3            0.03     1.151432    -0.554928    -0.554223\n",
       "4            0.04     0.657189    -0.603207    -0.731570\n",
       "...           ...          ...          ...          ...\n",
       "3433956  34339.56   130.392434    29.491627   -26.868202\n",
       "3433957  34339.57   135.771133  -184.515525   -21.544211\n",
       "3433958  34339.58   146.364103  -324.248909    -5.248641\n",
       "3433959  34339.59   273.675024  -293.011330    14.618256\n",
       "3433960  34339.60   372.878731  -158.516265    35.330770\n",
       "\n",
       "[3433961 rows x 4 columns]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from paradigma.config import IMUConfig\n",
    "from paradigma.preprocessing import preprocess_imu_data\n",
    "\n",
    "config = IMUConfig()\n",
    "print(f'The data is resampled to {config.sampling_frequency} Hz.')\n",
    "\n",
    "df_preprocessed_data = preprocess_imu_data(df_data, config, sensor='gyroscope', watch_side='left')\n",
    "\n",
    "df_preprocessed_data"
   ]
  },
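  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate what resampling onto a uniform time grid means, the sketch below linearly interpolates a toy signal with jittered timestamps onto a 100 Hz grid using `numpy.interp`. The helper `resample_uniform` and the toy data are hypothetical, and linear interpolation is an assumption made for simplicity; `preprocess_imu_data` may use a different interpolation method and may apply further processing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "def resample_uniform(df, time_col, value_cols, fs):\n",
    "    # Hypothetical helper: linearly interpolate value_cols onto a uniform grid at fs Hz\n",
    "    t_orig = df[time_col].to_numpy()\n",
    "    n_samples = int(np.floor((t_orig[-1] - t_orig[0]) * fs)) + 1\n",
    "    t_new = t_orig[0] + np.arange(n_samples) / fs\n",
    "    resampled = {time_col: t_new}\n",
    "    for col in value_cols:\n",
    "        resampled[col] = np.interp(t_new, t_orig, df[col].to_numpy())\n",
    "    return pd.DataFrame(resampled)\n",
    "\n",
    "# Toy signal: 5 Hz sine sampled at roughly 100 Hz with jittered time steps\n",
    "rng = np.random.default_rng(0)\n",
    "t_jittered = np.cumsum(rng.uniform(0.009, 0.011, size=500))\n",
    "t_jittered = t_jittered - t_jittered[0]\n",
    "df_toy = pd.DataFrame({'time': t_jittered, 'gyroscope_x': np.sin(2 * np.pi * 5 * t_jittered)})\n",
    "\n",
    "df_toy_resampled = resample_uniform(df_toy, 'time', ['gyroscope_x'], fs=100)\n",
    "\n",
    "# The original time steps vary, whereas the resampled steps are all 0.01 s\n",
    "steps_orig = np.diff(df_toy['time'].to_numpy())\n",
    "steps_new = np.diff(df_toy_resampled['time'].to_numpy())\n",
    "print('Original step range:', steps_orig.min(), steps_orig.max())\n",
    "print('Resampled steps uniform:', np.allclose(steps_new, 0.01))"
   ]
  },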
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2: Extract tremor features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function [`extract_tremor_features`](https://github.com/biomarkersParkinson/paradigma/blob/main/src/paradigma/pipelines/tremor_pipeline.py#:~:text=extract_tremor_features) splits the preprocessed gyroscope data into non-overlapping windows of length `config.window_length_s`. From each window, the following tremor features are extracted: 12 mel-frequency cepstral coefficients (MFCCs), the frequency of the peak in the power spectral density, the power in the band below tremor (0.5-3 Hz), and the power around the tremor peak. The last feature is not used for tremor detection, but is stored for tremor quantification in Step 4."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The window length is 4 seconds\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time</th>\n",
       "      <th>mfcc_1</th>\n",
       "      <th>mfcc_2</th>\n",
       "      <th>mfcc_3</th>\n",
       "      <th>mfcc_4</th>\n",
       "      <th>mfcc_5</th>\n",
       "      <th>mfcc_6</th>\n",
       "      <th>mfcc_7</th>\n",
       "      <th>mfcc_8</th>\n",
       "      <th>mfcc_9</th>\n",
       "      <th>mfcc_10</th>\n",
       "      <th>mfcc_11</th>\n",
       "      <th>mfcc_12</th>\n",
       "      <th>freq_peak</th>\n",
       "      <th>below_tremor_power</th>\n",
       "      <th>tremor_power</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.0</td>\n",
       "      <td>5.323582</td>\n",
       "      <td>1.179579</td>\n",
       "      <td>-0.498552</td>\n",
       "      <td>-0.149152</td>\n",
       "      <td>-0.063535</td>\n",
       "      <td>-0.132090</td>\n",
       "      <td>-0.112380</td>\n",
       "      <td>-0.044326</td>\n",
       "      <td>-0.025917</td>\n",
       "      <td>0.116045</td>\n",
       "      <td>0.169869</td>\n",
       "      <td>0.213884</td>\n",
       "      <td>3.75</td>\n",
       "      <td>0.082219</td>\n",
       "      <td>0.471588</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>4.0</td>\n",
       "      <td>5.333162</td>\n",
       "      <td>1.205712</td>\n",
       "      <td>-0.607844</td>\n",
       "      <td>-0.138371</td>\n",
       "      <td>-0.039518</td>\n",
       "      <td>-0.137703</td>\n",
       "      <td>-0.069552</td>\n",
       "      <td>-0.008029</td>\n",
       "      <td>-0.087711</td>\n",
       "      <td>0.089844</td>\n",
       "      <td>0.152380</td>\n",
       "      <td>0.195165</td>\n",
       "      <td>3.75</td>\n",
       "      <td>0.071260</td>\n",
       "      <td>0.327252</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>8.0</td>\n",
       "      <td>5.180974</td>\n",
       "      <td>1.039548</td>\n",
       "      <td>-0.627100</td>\n",
       "      <td>-0.054816</td>\n",
       "      <td>-0.016767</td>\n",
       "      <td>-0.044817</td>\n",
       "      <td>0.079859</td>\n",
       "      <td>-0.023155</td>\n",
       "      <td>0.024729</td>\n",
       "      <td>0.104989</td>\n",
       "      <td>0.126502</td>\n",
       "      <td>0.192319</td>\n",
       "      <td>7.75</td>\n",
       "      <td>0.097961</td>\n",
       "      <td>0.114138</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>12.0</td>\n",
       "      <td>5.290298</td>\n",
       "      <td>1.183957</td>\n",
       "      <td>-0.627651</td>\n",
       "      <td>-0.027235</td>\n",
       "      <td>0.095184</td>\n",
       "      <td>-0.050455</td>\n",
       "      <td>-0.024654</td>\n",
       "      <td>0.029754</td>\n",
       "      <td>-0.007459</td>\n",
       "      <td>0.125700</td>\n",
       "      <td>0.146895</td>\n",
       "      <td>0.220589</td>\n",
       "      <td>7.75</td>\n",
       "      <td>0.193237</td>\n",
       "      <td>0.180988</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>16.0</td>\n",
       "      <td>5.128074</td>\n",
       "      <td>1.066869</td>\n",
       "      <td>-0.622282</td>\n",
       "      <td>0.038557</td>\n",
       "      <td>-0.034719</td>\n",
       "      <td>0.045109</td>\n",
       "      <td>0.076679</td>\n",
       "      <td>0.057267</td>\n",
       "      <td>-0.024619</td>\n",
       "      <td>0.131755</td>\n",
       "      <td>0.177849</td>\n",
       "      <td>0.149686</td>\n",
       "      <td>7.75</td>\n",
       "      <td>0.156469</td>\n",
       "      <td>0.090009</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8579</th>\n",
       "      <td>34316.0</td>\n",
       "      <td>7.071408</td>\n",
       "      <td>-0.376556</td>\n",
       "      <td>0.272322</td>\n",
       "      <td>0.068750</td>\n",
       "      <td>0.051588</td>\n",
       "      <td>0.102012</td>\n",
       "      <td>0.055017</td>\n",
       "      <td>0.115942</td>\n",
       "      <td>0.012746</td>\n",
       "      <td>0.117970</td>\n",
       "      <td>0.073279</td>\n",
       "      <td>0.057367</td>\n",
       "      <td>13.50</td>\n",
       "      <td>48.930380</td>\n",
       "      <td>91.971686</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8580</th>\n",
       "      <td>34320.0</td>\n",
       "      <td>1.917642</td>\n",
       "      <td>0.307927</td>\n",
       "      <td>0.142330</td>\n",
       "      <td>0.265357</td>\n",
       "      <td>0.285635</td>\n",
       "      <td>0.143886</td>\n",
       "      <td>0.259636</td>\n",
       "      <td>0.195724</td>\n",
       "      <td>0.176947</td>\n",
       "      <td>0.162205</td>\n",
       "      <td>0.147897</td>\n",
       "      <td>0.170488</td>\n",
       "      <td>11.00</td>\n",
       "      <td>0.012123</td>\n",
       "      <td>0.000316</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8581</th>\n",
       "      <td>34324.0</td>\n",
       "      <td>2.383806</td>\n",
       "      <td>0.268580</td>\n",
       "      <td>0.151254</td>\n",
       "      <td>0.414430</td>\n",
       "      <td>0.241540</td>\n",
       "      <td>0.244071</td>\n",
       "      <td>0.201109</td>\n",
       "      <td>0.209611</td>\n",
       "      <td>0.097146</td>\n",
       "      <td>0.048798</td>\n",
       "      <td>0.013239</td>\n",
       "      <td>0.035379</td>\n",
       "      <td>2.00</td>\n",
       "      <td>0.013077</td>\n",
       "      <td>0.000615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8582</th>\n",
       "      <td>34328.0</td>\n",
       "      <td>1.883626</td>\n",
       "      <td>0.089983</td>\n",
       "      <td>0.196880</td>\n",
       "      <td>0.300523</td>\n",
       "      <td>0.239185</td>\n",
       "      <td>0.259342</td>\n",
       "      <td>0.277586</td>\n",
       "      <td>0.206517</td>\n",
       "      <td>0.178499</td>\n",
       "      <td>0.215561</td>\n",
       "      <td>0.067234</td>\n",
       "      <td>0.123958</td>\n",
       "      <td>13.75</td>\n",
       "      <td>0.011466</td>\n",
       "      <td>0.000211</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8583</th>\n",
       "      <td>34332.0</td>\n",
       "      <td>2.599103</td>\n",
       "      <td>0.286252</td>\n",
       "      <td>-0.014529</td>\n",
       "      <td>0.475488</td>\n",
       "      <td>0.229446</td>\n",
       "      <td>0.188200</td>\n",
       "      <td>0.173689</td>\n",
       "      <td>0.033262</td>\n",
       "      <td>0.138957</td>\n",
       "      <td>0.106176</td>\n",
       "      <td>0.036859</td>\n",
       "      <td>0.082178</td>\n",
       "      <td>12.50</td>\n",
       "      <td>0.015068</td>\n",
       "      <td>0.000891</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8584 rows × 16 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         time    mfcc_1    mfcc_2    mfcc_3    mfcc_4    mfcc_5    mfcc_6  \\\n",
       "0         0.0  5.323582  1.179579 -0.498552 -0.149152 -0.063535 -0.132090   \n",
       "1         4.0  5.333162  1.205712 -0.607844 -0.138371 -0.039518 -0.137703   \n",
       "2         8.0  5.180974  1.039548 -0.627100 -0.054816 -0.016767 -0.044817   \n",
       "3        12.0  5.290298  1.183957 -0.627651 -0.027235  0.095184 -0.050455   \n",
       "4        16.0  5.128074  1.066869 -0.622282  0.038557 -0.034719  0.045109   \n",
       "...       ...       ...       ...       ...       ...       ...       ...   \n",
       "8579  34316.0  7.071408 -0.376556  0.272322  0.068750  0.051588  0.102012   \n",
       "8580  34320.0  1.917642  0.307927  0.142330  0.265357  0.285635  0.143886   \n",
       "8581  34324.0  2.383806  0.268580  0.151254  0.414430  0.241540  0.244071   \n",
       "8582  34328.0  1.883626  0.089983  0.196880  0.300523  0.239185  0.259342   \n",
       "8583  34332.0  2.599103  0.286252 -0.014529  0.475488  0.229446  0.188200   \n",
       "\n",
       "        mfcc_7    mfcc_8    mfcc_9   mfcc_10   mfcc_11   mfcc_12  freq_peak  \\\n",
       "0    -0.112380 -0.044326 -0.025917  0.116045  0.169869  0.213884       3.75   \n",
       "1    -0.069552 -0.008029 -0.087711  0.089844  0.152380  0.195165       3.75   \n",
       "2     0.079859 -0.023155  0.024729  0.104989  0.126502  0.192319       7.75   \n",
       "3    -0.024654  0.029754 -0.007459  0.125700  0.146895  0.220589       7.75   \n",
       "4     0.076679  0.057267 -0.024619  0.131755  0.177849  0.149686       7.75   \n",
       "...        ...       ...       ...       ...       ...       ...        ...   \n",
       "8579  0.055017  0.115942  0.012746  0.117970  0.073279  0.057367      13.50   \n",
       "8580  0.259636  0.195724  0.176947  0.162205  0.147897  0.170488      11.00   \n",
       "8581  0.201109  0.209611  0.097146  0.048798  0.013239  0.035379       2.00   \n",
       "8582  0.277586  0.206517  0.178499  0.215561  0.067234  0.123958      13.75   \n",
       "8583  0.173689  0.033262  0.138957  0.106176  0.036859  0.082178      12.50   \n",
       "\n",
       "      below_tremor_power  tremor_power  \n",
       "0               0.082219      0.471588  \n",
       "1               0.071260      0.327252  \n",
       "2               0.097961      0.114138  \n",
       "3               0.193237      0.180988  \n",
       "4               0.156469      0.090009  \n",
       "...                  ...           ...  \n",
       "8579           48.930380     91.971686  \n",
       "8580            0.012123      0.000316  \n",
       "8581            0.013077      0.000615  \n",
       "8582            0.011466      0.000211  \n",
       "8583            0.015068      0.000891  \n",
       "\n",
       "[8584 rows x 16 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from paradigma.config import TremorConfig\n",
    "from paradigma.pipelines.tremor_pipeline import extract_tremor_features\n",
    "\n",
    "config = TremorConfig(step='features')\n",
    "print(f'The window length is {config.window_length_s} seconds')\n",
    "\n",
    "df_features = extract_tremor_features(df_preprocessed_data, config)\n",
    "\n",
    "df_features"
   ]
  },
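  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the windowing and of two of the spectral features, the sketch below cuts a toy gyroscope signal into non-overlapping 4-second windows, estimates a power spectral density per window with `scipy.signal.periodogram`, and derives a peak frequency and the power in the 0.5-3 Hz band. The helper `sketch_spectral_features`, the Hann window, the summation over axes and the toy signal are all assumptions made for this example; `extract_tremor_features` may estimate the spectrum differently, and the MFCCs are not covered here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.signal import periodogram\n",
    "\n",
    "def sketch_spectral_features(gyro, fs, window_length_s):\n",
    "    # Hypothetical helper; gyro has shape (n_samples, n_axes)\n",
    "    n_per_window = int(window_length_s * fs)\n",
    "    n_windows = gyro.shape[0] // n_per_window\n",
    "    # Drop the incomplete tail and reshape to (n_windows, n_per_window, n_axes)\n",
    "    windows = gyro[: n_windows * n_per_window].reshape(n_windows, n_per_window, -1)\n",
    "    # Power spectral density per window and axis, then summed over the axes\n",
    "    freqs, psd = periodogram(windows, fs=fs, window='hann', axis=1)\n",
    "    total_psd = psd.sum(axis=2)\n",
    "    # Frequency at which the summed spectrum peaks\n",
    "    freq_peak = freqs[np.argmax(total_psd, axis=1)]\n",
    "    # Power in the band below tremor (0.5-3 Hz), integrated over frequency\n",
    "    band = (freqs >= 0.5) & (freqs < 3)\n",
    "    freq_step = freqs[1] - freqs[0]\n",
    "    below_tremor_power = total_psd[:, band].sum(axis=1) * freq_step\n",
    "    return freq_peak, below_tremor_power\n",
    "\n",
    "# Toy data: 12 seconds at 100 Hz with a 5 Hz oscillation on the first axis only\n",
    "fs_toy = 100\n",
    "t_toy = np.arange(12 * fs_toy) / fs_toy\n",
    "gyro_toy = np.zeros((t_toy.size, 3))\n",
    "gyro_toy[:, 0] = 10 * np.sin(2 * np.pi * 5 * t_toy)\n",
    "\n",
    "freq_peak_toy, below_power_toy = sketch_spectral_features(gyro_toy, fs_toy, window_length_s=4)\n",
    "print('Peak frequency per window (Hz):', freq_peak_toy)\n",
    "print('Power in 0.5-3 Hz per window:', below_power_toy)"
   ]
  },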
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3: Detect tremor"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function [`detect_tremor`](https://github.com/biomarkersParkinson/paradigma/blob/main/src/paradigma/pipelines/tremor_pipeline.py#:~:text=detect_tremor) uses a pretrained logistic regression classifier to predict the tremor probability (`pred_tremor_proba`) for each window, based on the MFCCs. Using a prespecified threshold, each window is assigned a tremor label of 0 (no tremor) or 1 (tremor), stored in `pred_tremor_logreg`. The detected tremor windows are then checked for rest tremor in two ways. First, the frequency of the peak should be between 3 and 7 Hz. Second, we want to exclude windows with significant arm movements: a window is considered to contain significant arm movement if `below_tremor_power` exceeds `config.movement_threshold`. The final tremor label is saved in `pred_tremor_checked`. A label for predicted arm at rest (`pred_arm_at_rest`, which is 1 when at rest and 0 when not at rest) is also saved, to control for the amount of arm movement during the observed time period when aggregating the amount of tremor time in Step 5 (if a person is moving their arm, they cannot have rest tremor)."
   ]
  },
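  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The thresholding and the two rest tremor checks described above can be sketched with plain pandas operations. The helper `sketch_rest_tremor_checks`, the probability threshold of 0.025 and the inclusive comparisons are assumptions made for illustration and are not taken from the `detect_tremor` implementation. The toy rows reuse three windows from this notebook's outputs, and the movement threshold of 50 deg²/s² is the value this notebook prints for the detection step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "def sketch_rest_tremor_checks(df, proba_threshold, movement_threshold, fmin=3.0, fmax=7.0):\n",
    "    # Hypothetical helper mirroring the description above, not the library code\n",
    "    out = df[['time']].copy()\n",
    "    # Classifier label: 1 if the predicted probability reaches the threshold\n",
    "    out['pred_tremor_logreg'] = (df['pred_tremor_proba'] >= proba_threshold).astype(int)\n",
    "    # Arm at rest: power below the tremor band does not exceed the movement threshold\n",
    "    out['pred_arm_at_rest'] = (df['below_tremor_power'] <= movement_threshold).astype(int)\n",
    "    # Rest tremor: classifier positive, spectral peak within 3-7 Hz, and arm at rest\n",
    "    peak_in_band = df['freq_peak'].between(fmin, fmax)\n",
    "    out['pred_tremor_checked'] = (\n",
    "        (out['pred_tremor_logreg'] == 1) & peak_in_band & (out['pred_arm_at_rest'] == 1)\n",
    "    ).astype(int)\n",
    "    return out\n",
    "\n",
    "# Three windows with values copied from the outputs in this notebook\n",
    "df_toy_windows = pd.DataFrame({\n",
    "    'time': [0.0, 8.0, 12.0],\n",
    "    'pred_tremor_proba': [0.038968, 0.031255, 0.021106],\n",
    "    'freq_peak': [3.75, 7.75, 7.75],\n",
    "    'below_tremor_power': [0.082219, 0.097961, 0.193237],\n",
    "})\n",
    "\n",
    "# proba_threshold is invented for this example; movement_threshold is in deg²/s²\n",
    "df_toy_labels = sketch_rest_tremor_checks(df_toy_windows, proba_threshold=0.025, movement_threshold=50)\n",
    "\n",
    "df_toy_labels"
   ]
  },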
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A threshold of 50 deg²/s² is used to determine whether the arm is at rest or in stable posture.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time</th>\n",
       "      <th>pred_tremor_proba</th>\n",
       "      <th>pred_tremor_logreg</th>\n",
       "      <th>pred_arm_at_rest</th>\n",
       "      <th>pred_tremor_checked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.038968</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>4.0</td>\n",
       "      <td>0.035365</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>8.0</td>\n",
       "      <td>0.031255</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>12.0</td>\n",
       "      <td>0.021106</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>16.0</td>\n",
       "      <td>0.021078</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8579</th>\n",
       "      <td>34316.0</td>\n",
       "      <td>0.000296</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8580</th>\n",
       "      <td>34320.0</td>\n",
       "      <td>0.000089</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8581</th>\n",
       "      <td>34324.0</td>\n",
       "      <td>0.000023</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8582</th>\n",
       "      <td>34328.0</td>\n",
       "      <td>0.000053</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8583</th>\n",
       "      <td>34332.0</td>\n",
       "      <td>0.000049</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8584 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         time  pred_tremor_proba  pred_tremor_logreg  pred_arm_at_rest  \\\n",
       "0         0.0           0.038968                   1                 1   \n",
       "1         4.0           0.035365                   1                 1   \n",
       "2         8.0           0.031255                   1                 1   \n",
       "3        12.0           0.021106                   0                 1   \n",
       "4        16.0           0.021078                   0                 1   \n",
       "...       ...                ...                 ...               ...   \n",
       "8579  34316.0           0.000296                   0                 1   \n",
       "8580  34320.0           0.000089                   0                 1   \n",
       "8581  34324.0           0.000023                   0                 1   \n",
       "8582  34328.0           0.000053                   0                 1   \n",
       "8583  34332.0           0.000049                   0                 1   \n",
       "\n",
       "      pred_tremor_checked  \n",
       "0                       1  \n",
       "1                       1  \n",
       "2                       0  \n",
       "3                       0  \n",
       "4                       0  \n",
       "...                   ...  \n",
       "8579                    0  \n",
       "8580                    0  \n",
       "8581                    0  \n",
       "8582                    0  \n",
       "8583                    0  \n",
       "\n",
       "[8584 rows x 5 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from importlib.resources import files\n",
    "from paradigma.pipelines.tremor_pipeline import detect_tremor\n",
    "\n",
    "print(f'A threshold of {config.movement_threshold} deg\\u00b2/s\\u00b2 \\\n",
    "is used to determine whether the arm is at rest or in stable posture.')\n",
    "\n",
    "# Load the pre-trained logistic regression classifier\n",
    "tremor_detection_classifier_package_filename = 'tremor_detection_clf_package.pkl'\n",
    "full_path_to_classifier_package = files('paradigma') / 'assets' / tremor_detection_classifier_package_filename\n",
    "\n",
    "# Use the logistic regression classifier to detect tremor and check for rest tremor\n",
    "df_predictions = detect_tremor(df_features, config, full_path_to_classifier_package)\n",
    "\n",
    "df_predictions[['time', 'pred_tremor_proba', 'pred_tremor_logreg', 'pred_arm_at_rest', 'pred_tremor_checked']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Store as TSDF\n",
    "The predicted probabilities (and optionally other features) can be stored and loaded in TSDF as demonstrated below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tsdf\n",
    "from paradigma.util import write_df_data\n",
    "\n",
    "# Set 'path_to_data' to the directory where you want to save the data\n",
    "metadata_time_store = tsdf.TSDFMetadata(metadata_time.get_plain_tsdf_dict_copy(), path_to_data)\n",
    "metadata_values_store = tsdf.TSDFMetadata(metadata_values.get_plain_tsdf_dict_copy(), path_to_data)\n",
    "\n",
    "# Select the columns to be saved \n",
    "metadata_time_store.channels = ['time']\n",
    "metadata_values_store.channels = ['pred_tremor_proba', 'pred_tremor_logreg', 'pred_arm_at_rest', 'pred_tremor_checked']\n",
    "\n",
    "# Set the units\n",
    "metadata_time_store.units = ['Relative seconds']\n",
    "metadata_values_store.units = ['Unitless', 'Unitless', 'Unitless', 'Unitless']  \n",
    "metadata_time_store.data_type = float\n",
    "metadata_values_store.data_type = float\n",
    "\n",
    "# Set the filenames\n",
    "meta_store_filename = f'segment{segment_nr}_meta.json'\n",
    "values_store_filename = meta_store_filename.replace('_meta.json', '_values.bin')\n",
    "time_store_filename = meta_store_filename.replace('_meta.json', '_time.bin')\n",
    "\n",
    "metadata_values_store.file_name = values_store_filename\n",
    "metadata_time_store.file_name = time_store_filename\n",
    "\n",
    "write_df_data(metadata_time_store, metadata_values_store, path_to_data, meta_store_filename, df_predictions)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_predictions, _, _ = load_tsdf_dataframe(path_to_data, prefix=f'segment{segment_nr}')\n",
    "df_predictions.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4: Quantify tremor"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The tremor power of all predicted tremor windows (where `pred_tremor_checked` is 1) is used for tremor quantification. A datetime column is also added, providing necessary information before aggregating over specified hours in Step 5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time</th>\n",
       "      <th>time_dt</th>\n",
       "      <th>pred_arm_at_rest</th>\n",
       "      <th>pred_tremor_checked</th>\n",
       "      <th>tremor_power</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.0</td>\n",
       "      <td>2019-08-20 12:39:16+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0.471588</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>4.0</td>\n",
       "      <td>2019-08-20 12:39:20+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0.327252</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>8.0</td>\n",
       "      <td>2019-08-20 12:39:24+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>12.0</td>\n",
       "      <td>2019-08-20 12:39:28+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>16.0</td>\n",
       "      <td>2019-08-20 12:39:32+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8579</th>\n",
       "      <td>34316.0</td>\n",
       "      <td>2019-08-20 22:11:12+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8580</th>\n",
       "      <td>34320.0</td>\n",
       "      <td>2019-08-20 22:11:16+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8581</th>\n",
       "      <td>34324.0</td>\n",
       "      <td>2019-08-20 22:11:20+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8582</th>\n",
       "      <td>34328.0</td>\n",
       "      <td>2019-08-20 22:11:24+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8583</th>\n",
       "      <td>34332.0</td>\n",
       "      <td>2019-08-20 22:11:28+02:00</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8584 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         time                   time_dt  pred_arm_at_rest  \\\n",
       "0         0.0 2019-08-20 12:39:16+02:00                 1   \n",
       "1         4.0 2019-08-20 12:39:20+02:00                 1   \n",
       "2         8.0 2019-08-20 12:39:24+02:00                 1   \n",
       "3        12.0 2019-08-20 12:39:28+02:00                 1   \n",
       "4        16.0 2019-08-20 12:39:32+02:00                 1   \n",
       "...       ...                       ...               ...   \n",
       "8579  34316.0 2019-08-20 22:11:12+02:00                 1   \n",
       "8580  34320.0 2019-08-20 22:11:16+02:00                 1   \n",
       "8581  34324.0 2019-08-20 22:11:20+02:00                 1   \n",
       "8582  34328.0 2019-08-20 22:11:24+02:00                 1   \n",
       "8583  34332.0 2019-08-20 22:11:28+02:00                 1   \n",
       "\n",
       "      pred_tremor_checked  tremor_power  \n",
       "0                       1      0.471588  \n",
       "1                       1      0.327252  \n",
       "2                       0           NaN  \n",
       "3                       0           NaN  \n",
       "4                       0           NaN  \n",
       "...                   ...           ...  \n",
       "8579                    0           NaN  \n",
       "8580                    0           NaN  \n",
       "8581                    0           NaN  \n",
       "8582                    0           NaN  \n",
       "8583                    0           NaN  \n",
       "\n",
       "[8584 rows x 5 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import datetime\n",
    "import pytz\n",
    "\n",
    "df_quantification = df_predictions[['time', 'pred_arm_at_rest', 'pred_tremor_checked','tremor_power']].copy()\n",
    "df_quantification.loc[df_predictions['pred_tremor_checked'] == 0, 'tremor_power'] = None # tremor power of non-tremor windows is set to None\n",
    "\n",
    "# Create datetime column based on the start time of the segment\n",
    "start_time = datetime.datetime.strptime(metadata_time.start_iso8601, '%Y-%m-%dT%H:%M:%SZ')\n",
    "start_time = start_time.replace(tzinfo=pytz.timezone('UTC')).astimezone(pytz.timezone('CET')) # convert to correct timezone if necessary\n",
    "df_quantification['time_dt'] = start_time + pd.to_timedelta(df_quantification['time'], unit=\"s\") \n",
    "df_quantification = df_quantification[['time', 'time_dt', 'pred_arm_at_rest', 'pred_tremor_checked', 'tremor_power']]\n",
    "\n",
    "df_quantification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Run steps 1 - 4 for all segments <a id='multiple_segments_cell'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If your data is also stored in multiple segments, you can modify `segments` in the cell below to a list of the filenames of your respective segmented data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from importlib.resources import files\n",
    "\n",
    "from paradigma.util import load_tsdf_dataframe\n",
    "from paradigma.config import IMUConfig, TremorConfig\n",
    "from paradigma.preprocessing import preprocess_imu_data\n",
    "from paradigma.pipelines.tremor_pipeline import extract_tremor_features, detect_tremor\n",
    "\n",
    "# Set the path to where the prepared data is saved\n",
    "path_to_data =  Path('../../example_data')\n",
    "path_to_prepared_data = path_to_data / 'imu'\n",
    "\n",
    "# Load the pre-trained logistic regression classifier\n",
    "tremor_detection_classifier_package_filename = 'tremor_detection_clf_package.pkl'\n",
    "full_path_to_classifier_package = files('paradigma') / 'assets' / tremor_detection_classifier_package_filename\n",
    "\n",
    "# Create a list of dataframes to store the quantifications of all segments\n",
    "list_df_quantifications = []\n",
    "\n",
    "segments  = ['0001','0002'] # list with all  available segments\n",
    "\n",
    "for segment_nr in segments:\n",
    "    \n",
    "    # Load the data\n",
    "    df_data, metadata_time, _ = load_tsdf_dataframe(path_to_prepared_data, prefix='IMU_segment'+segment_nr)\n",
    "\n",
    "    # 1: Preprocess the data\n",
    "    config = IMUConfig()\n",
    "    df_preprocessed_data = preprocess_imu_data(df_data, config, sensor='gyroscope', watch_side='left')\n",
    "\n",
    "    # 2: Extract features\n",
    "    config = TremorConfig(step='features')\n",
    "    df_features = extract_tremor_features(df_preprocessed_data, config)\n",
    "\n",
    "    # 3: Detect tremor\n",
    "    df_predictions = detect_tremor(df_features, config, full_path_to_classifier_package)\n",
    "\n",
    "    # 4: Quantify tremor\n",
    "    df_quantification = df_predictions[['time', 'pred_arm_at_rest', 'pred_tremor_checked','tremor_power']].copy()\n",
    "    df_quantification.loc[df_predictions['pred_tremor_checked'] == 0, 'tremor_power'] = None\n",
    "\n",
    "    # Create datetime column based on the start time of the segment\n",
    "    start_time = datetime.datetime.strptime(metadata_time.start_iso8601, '%Y-%m-%dT%H:%M:%SZ')\n",
    "    start_time = start_time.replace(tzinfo=pytz.timezone('UTC')).astimezone(pytz.timezone('CET')) # convert to correct timezone if necessary\n",
    "    df_quantification['time_dt'] = start_time + pd.to_timedelta(df_quantification['time'], unit=\"s\") \n",
    "    df_quantification = df_quantification[['time', 'time_dt', 'pred_arm_at_rest', 'pred_tremor_checked', 'tremor_power']]\n",
    "\n",
    "    # Add the quantifications of the current segment to the list\n",
    "    df_quantification['segment_nr'] = segment_nr\n",
    "    list_df_quantifications.append(df_quantification)\n",
    "\n",
    "df_quantification = pd.concat(list_df_quantifications, ignore_index=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 5: Compute aggregated tremor measures"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The final step is to compute the amount of tremor time and tremor power with the function [`aggregate_tremor`](https://github.com/biomarkersParkinson/paradigma/blob/main/src/paradigma/pipelines/tremor_pipeline.py#:~:text=aggregate_tremor), which aggregates over all windows in the input dataframe. Depending on the size of the input dateframe, you could select the hours and days (both optional) that you want to include in this analysis. In this case we use data collected between 8 am and 10 pm (specified as `select_hours_start` and `select_hours_end`), and days with at least 10 hours of data (`min_hours_per_day`) based on. Based on the selected data, we compute aggregated measures for tremor time and tremor power:\n",
    "- Tremor time is calculated as the number of detected tremor windows, as percentage of the number of windows while the arm is at rest or in stable posture (when `below_tremor_power` does not exceed `config.movement_threshold`). This way the tremor time is controlled for the amount of time the arm is at rest or in stable posture, when rest tremor and re-emergent tremor could occur.\n",
    "- For tremor power the following aggregates are derived: the mode, median and 90th percentile of tremor power (specified in `config.aggregates_tremor_power`). The median and modal tremor power reflect the typical tremor severity, whereas the 90th percentile reflects the maximal tremor severity within the observed timeframe. The aggregated tremor measures and metadata are stored in a json file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Before aggregation we select data collected between 08:00 and 22:00. We also select days with at least 10 hours of data.\n",
      "The following tremor power aggregates are derived: ['mode', 'median', '90p'].\n",
      "{'aggregated_tremor_measures': {'90p_tremor_power': 1.3259483071516063,\n",
      "                                'median_tremor_power': 0.5143985314908104,\n",
      "                                'modal_tremor_power': 0.3,\n",
      "                                'perc_windows_tremor': 19.386769676484793},\n",
      " 'metadata': {'nr_valid_days': 1,\n",
      "              'nr_windows_rest': 8284,\n",
      "              'nr_windows_total': 12600}}\n"
     ]
    }
   ],
   "source": [
    "import pprint\n",
    "from paradigma.util import select_hours, select_days\n",
    "from paradigma.pipelines.tremor_pipeline import aggregate_tremor\n",
    "\n",
    "select_hours_start = '08:00' # you can specifiy the hours and minutes here\n",
    "select_hours_end = '22:00'\n",
    "min_hours_per_day = 10\n",
    "\n",
    "print(f'Before aggregation we select data collected between {select_hours_start} \\\n",
    "and {select_hours_end}. We also select days with at least {min_hours_per_day} hours of data.')\n",
    "print(f'The following tremor power aggregates are derived: {config.aggregates_tremor_power}.')\n",
    "\n",
    "# Select the hours that should be included in the analysis\n",
    "df_quantification = select_hours(df_quantification, select_hours_start, select_hours_end)\n",
    "\n",
    "# Remove days with less than the specified minimum amount of hours\n",
    "df_quantification = select_days(df_quantification, min_hours_per_day)\n",
    "\n",
    "# Compute the aggregated measures\n",
    "config = TremorConfig()\n",
    "d_tremor_aggregates = aggregate_tremor(df = df_quantification, config = config)\n",
    "\n",
    "pprint.pprint(d_tremor_aggregates)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "paradigma-cfWEGyqZ-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}