Preprocessing Raw Gaze Data
What you will learn in this tutorial:
how to transform pixel coordinates into degrees of visual angle
how to transform positional data into velocity data
Preparations
We import pymovements as the alias pm for convenience.
[1]:
import pymovements as pm
Let’s start by downloading our ToyDataset and loading in its data:
[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
100%|██████████| 20/20 [00:00<00:00, 192.20it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7f3f86894d30>
We can verify that all files have been loaded in by checking the fileinfo attribute:
[3]:
dataset.fileinfo
[3]:
| text_id | page_id | filepath |
|---|---|---|
| i64 | i64 | str |
| 0 | 1 | "aeye-lab-pymov… |
| 0 | 2 | "aeye-lab-pymov… |
| 0 | 3 | "aeye-lab-pymov… |
| 0 | 4 | "aeye-lab-pymov… |
| 0 | 5 | "aeye-lab-pymov… |
| 1 | 1 | "aeye-lab-pymov… |
| 1 | 2 | "aeye-lab-pymov… |
| 1 | 3 | "aeye-lab-pymov… |
| 1 | 4 | "aeye-lab-pymov… |
| 1 | 5 | "aeye-lab-pymov… |
| 2 | 1 | "aeye-lab-pymov… |
| 2 | 2 | "aeye-lab-pymov… |
| 2 | 3 | "aeye-lab-pymov… |
| 2 | 4 | "aeye-lab-pymov… |
| 2 | 5 | "aeye-lab-pymov… |
| 3 | 1 | "aeye-lab-pymov… |
| 3 | 2 | "aeye-lab-pymov… |
| 3 | 3 | "aeye-lab-pymov… |
| 3 | 4 | "aeye-lab-pymov… |
| 3 | 5 | "aeye-lab-pymov… |
Now let’s inspect our gaze dataframe:
[4]:
dataset.gaze[0].frame.head()
[4]:
| text_id | page_id | time | x_right_pix | y_right_pix |
|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 |
Apart from the text_id and page_id labels, the dataframe has the following columns: time, x_right_pix and y_right_pix.
Preprocessing
We now want to transform these pixel coordinates into degrees of visual angle. This is done by calling pix2deg():
[5]:
dataset.pix2deg()
dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 698.11it/s]
[5]:
| text_id | page_id | time | x_right_pix | y_right_pix | x_right_pos | y_right_pos |
|---|---|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 | -10.697598 | -8.852399 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 | -10.695183 | -8.859678 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 | -10.692768 | -8.866956 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 | -10.690352 | -8.869381 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 | -10.692768 | -8.874233 |
| 0 | 1 | 1.98815e6 | 207.0 | 151.3 | -10.692768 | -8.879085 |
| 0 | 1 | 1.988151e6 | 207.2 | 151.4 | -10.687937 | -8.876659 |
| 0 | 1 | 1.988152e6 | 207.4 | 151.6 | -10.683106 | -8.871807 |
| 0 | 1 | 1.988153e6 | 207.6 | 151.9 | -10.678275 | -8.86453 |
| 0 | 1 | 1.988154e6 | 207.7 | 152.1 | -10.67586 | -8.859678 |
| 0 | 1 | 1.988155e6 | 207.7 | 152.1 | -10.67586 | -8.859678 |
| 0 | 1 | 1.988156e6 | 207.7 | 152.2 | -10.67586 | -8.857252 |
| … | … | … | … | … | … | … |
| 0 | 1 | 2.005356e6 | 370.4 | 419.0 | -6.700617 | -2.297363 |
| 0 | 1 | 2.005357e6 | 371.2 | 419.0 | -6.680877 | -2.297363 |
| 0 | 1 | 2.005358e6 | 371.1 | 418.9 | -6.683345 | -2.299844 |
| 0 | 1 | 2.005359e6 | 369.9 | 418.7 | -6.712953 | -2.304806 |
| 0 | 1 | 2.00536e6 | 368.1 | 418.1 | -6.75736 | -2.319691 |
| 0 | 1 | 2.005361e6 | 365.9 | 417.1 | -6.811623 | -2.3445 |
| 0 | 1 | 2.005362e6 | 363.3 | 416.3 | -6.875737 | -2.364346 |
| 0 | 1 | 2.005363e6 | 361.0 | 415.4 | -6.932438 | -2.386672 |
| 0 | 1 | 2.005364e6 | 358.0 | 414.5 | -7.006376 | -2.408998 |
| 0 | 1 | 2.005365e6 | 355.8 | 413.8 | -7.060582 | -2.426362 |
| 0 | 1 | 2.005366e6 | 353.1 | 413.2 | -7.12709 | -2.441245 |
| 0 | 1 | 2.005367e6 | 351.2 | 412.9 | -7.173881 | -2.448686 |
The processed result has been added as new columns to our gaze dataframe: x_right_pos, y_right_pos.
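To make the conversion less opaque, here is a minimal sketch of how a single pixel coordinate can be turned into degrees of visual angle with plain trigonometry. It is not taken from the pymovements source: the helper name pix_to_deg and the screen geometry (1280 px and 38 cm wide, viewed from 68 cm) are assumptions made for illustration and are not stated in this tutorial.

```python
import math


def pix_to_deg(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (hypothetical helper)."""
    # Express the eye-to-screen distance in pixel units.
    distance_px = distance_cm * screen_px / screen_cm
    # Shift the origin to the screen center so that 0 degrees is straight ahead.
    centered = pixel - (screen_px - 1) / 2
    # The visual angle is the arctangent of the offset over the distance.
    return math.degrees(math.atan2(centered, distance_px))


# Assumed geometry: screen 1280 px / 38 cm wide, 68 cm viewing distance.
x_deg = pix_to_deg(206.8, screen_px=1280, screen_cm=38.0, distance_cm=68.0)
print(round(x_deg, 2))
```

Under these assumed values the sketch gives roughly -10.70 degrees for x_right_pix = 206.8, which is in line with the x_right_pos value in the first row of the table above.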
We would also like to have velocity data available. Four methods are available:
preceding: takes only the single preceding sample into account for the velocity calculation. This is the noisiest variant.
neighbors: takes the neighboring samples into account for the velocity calculation. A bit less noisy.
smooth: increases the neighboring samples to two on each side, which gives a smoother conversion.
savitzky_golay: uses the Savitzky-Golay differentiation filter for the conversion. You can specify additional parameters like window_length and polyorder. Depending on your parameters, this can lead to the best results.
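The three finite-difference variants can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the library's code: the helper name velocity is hypothetical, the sampling rate of 1000 Hz is inferred from the 1 ms steps in the time column, the division by 6 in the smooth variant is an assumed way of weighting two neighbors per side, and samples without enough neighbors are simply left as None. The input is the first eight x_right_pos values from the table above.

```python
def velocity(pos, method, sampling_rate=1000.0):
    """Finite-difference velocity in degrees per second (hypothetical helper)."""
    if method not in ("preceding", "neighbors", "smooth"):
        raise ValueError(f"unknown method: {method}")
    n = len(pos)
    vel = [None] * n  # samples without enough neighbors stay None
    for i in range(n):
        if method == "preceding" and i >= 1:
            # Difference to the single preceding sample.
            vel[i] = (pos[i] - pos[i - 1]) * sampling_rate
        elif method == "neighbors" and 1 <= i < n - 1:
            # Central difference over one neighbor on each side.
            vel[i] = (pos[i + 1] - pos[i - 1]) * sampling_rate / 2
        elif method == "smooth" and 2 <= i < n - 2:
            # Two neighbors on each side (assumed weighting).
            vel[i] = (pos[i + 2] + pos[i + 1] - pos[i - 1] - pos[i - 2]) * sampling_rate / 6
    return vel


x_pos = [-10.697598, -10.695183, -10.692768, -10.690352,
         -10.692768, -10.692768, -10.687937, -10.683106]

for method in ("preceding", "neighbors", "smooth"):
    print(method, velocity(x_pos, method)[5])
```

For the sixth sample the three estimates come out at 0, about 2.42 and about 2.01 degrees per second, which shows how much the choice of neighborhood affects a single velocity value.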
Let’s use the smooth method first:
[6]:
dataset.pos2vel(method='smooth')
dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 627.35it/s]
[6]:
| text_id | page_id | time | x_right_pix | y_right_pix | x_right_pos | y_right_pos | y_right_vel | x_right_vel |
|---|---|---|---|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 | -10.697598 | -8.852399 | -3.639106 | 1.207626 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 | -10.695183 | -8.859678 | -7.278067 | 2.415272 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 | -10.692768 | -8.866956 | -5.256267 | 1.610194 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 | -10.690352 | -8.869381 | -4.447465 | 0.402548 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 | -10.692768 | -8.874233 | -3.234462 | 0.402561 |
| 0 | 1 | 1.98815e6 | 207.0 | 151.3 | -10.692768 | -8.879085 | -0.808615 | 2.012819 |
| 0 | 1 | 1.988151e6 | 207.2 | 151.4 | -10.687937 | -8.876659 | 2.83017 | 4.025683 |
| 0 | 1 | 1.988152e6 | 207.4 | 151.6 | -10.683106 | -8.871807 | 5.256091 | 4.428328 |
| 0 | 1 | 1.988153e6 | 207.6 | 151.9 | -10.678275 | -8.86453 | 4.851847 | 3.220663 |
| 0 | 1 | 1.988154e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | 3.234622 | 1.610354 |
| 0 | 1 | 1.988155e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | 1.617343 | -2.9606e-13 |
| 0 | 1 | 1.988156e6 | 207.7 | 152.2 | -10.67586 | -8.857252 | 1.213025 | -0.805187 |
| … | … | … | … | … | … | … | … | … |
| 0 | 1 | 2.005356e6 | 370.4 | 419.0 | -6.700617 | -2.297363 | 1.653971 | 30.837758 |
| 0 | 1 | 2.005357e6 | 371.2 | 419.0 | -6.680877 | -2.297363 | -1.240481 | 7.401726 |
| 0 | 1 | 2.005358e6 | 371.1 | 418.9 | -6.683345 | -2.299844 | -4.961884 | -14.803188 |
| 0 | 1 | 2.005359e6 | 369.9 | 418.7 | -6.712953 | -2.304806 | -11.164066 | -34.126826 |
| 0 | 1 | 2.00536e6 | 368.1 | 418.1 | -6.75736 | -2.319691 | -17.366038 | -48.510256 |
| 0 | 1 | 2.005361e6 | 365.9 | 417.1 | -6.811623 | -2.3445 | -21.086876 | -56.310241 |
| 0 | 1 | 2.005362e6 | 363.3 | 416.3 | -6.875737 | -2.364346 | -21.913173 | -61.63851 |
| 0 | 1 | 2.005363e6 | 361.0 | 415.4 | -6.932438 | -2.386672 | -21.085616 | -63.266374 |
| 0 | 1 | 2.005364e6 | 358.0 | 414.5 | -7.006376 | -2.408998 | -19.431326 | -63.249652 |
| 0 | 1 | 2.005365e6 | 355.8 | 413.8 | -7.060582 | -2.426362 | -15.710061 | -60.359624 |
| 0 | 1 | 2.005366e6 | 353.1 | 413.2 | -7.12709 | -2.441245 | -11.162127 | -56.649541 |
| 0 | 1 | 2.005367e6 | 351.2 | 412.9 | -7.173881 | -2.448686 | -3.720668 | -23.395331 |
This added the following new velocity columns to our gaze dataframe: x_right_vel and y_right_vel.
We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:
[7]:
dataset.pos2vel(method='savitzky_golay', window_length=7, polyorder=2)
dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 414.79it/s]
[7]:
| text_id | page_id | time | x_right_pix | y_right_pix | x_right_pos | y_right_pos | y_right_vel | x_right_vel |
|---|---|---|---|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 | -10.697598 | -8.852399 | -8.231094 | 1.897693 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 | -10.695183 | -8.859678 | -6.902524 | 1.66768 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 | -10.692768 | -8.866956 | -5.573953 | 1.437667 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 | -10.690352 | -8.869381 | -4.245382 | 1.207654 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 | -10.692768 | -8.874233 | -2.339263 | 1.552735 |
| 0 | 1 | 1.98815e6 | 207.0 | 151.3 | -10.692768 | -8.879085 | 0.000009 | 2.242885 |
| 0 | 1 | 1.988151e6 | 207.2 | 151.4 | -10.687937 | -8.876659 | 1.992718 | 2.933036 |
| 0 | 1 | 1.988152e6 | 207.4 | 151.6 | -10.683106 | -8.871807 | 3.378942 | 3.364372 |
| 0 | 1 | 1.988153e6 | 207.6 | 151.9 | -10.678275 | -8.86453 | 3.98543 | 2.933062 |
| 0 | 1 | 1.988154e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | 3.292347 | 1.63908 |
| 0 | 1 | 1.988155e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | 2.425984 | 0.517608 |
| 0 | 1 | 1.988156e6 | 207.7 | 152.2 | -10.67586 | -8.857252 | 0.953079 | -0.25881 |
| … | … | … | … | … | … | … | … | … |
| 0 | 1 | 2.005356e6 | 370.4 | 419.0 | -6.700617 | -2.297363 | 2.215118 | 30.127398 |
| 0 | 1 | 2.005357e6 | 371.2 | 419.0 | -6.680877 | -2.297363 | -1.772092 | 8.10499 |
| 0 | 1 | 2.005358e6 | 371.1 | 418.9 | -6.683345 | -2.299844 | -6.645273 | -12.862729 |
| 0 | 1 | 2.005359e6 | 369.9 | 418.7 | -6.712953 | -2.304806 | -11.252527 | -30.745214 |
| 0 | 1 | 2.00536e6 | 368.1 | 418.1 | -6.75736 | -2.319691 | -15.593809 | -44.219144 |
| 0 | 1 | 2.005361e6 | 365.9 | 417.1 | -6.811623 | -2.3445 | -19.137494 | -54.515696 |
| 0 | 1 | 2.005362e6 | 363.3 | 416.3 | -6.875737 | -2.364346 | -20.909048 | -59.347609 |
| 0 | 1 | 2.005363e6 | 361.0 | 415.4 | -6.932438 | -2.386672 | -20.465552 | -62.062479 |
| 0 | 1 | 2.005364e6 | 358.0 | 414.5 | -7.006376 | -2.408998 | -18.073031 | -61.343786 |
| 0 | 1 | 2.005365e6 | 355.8 | 413.8 | -7.060582 | -2.426362 | -15.473874 | -59.509437 |
| 0 | 1 | 2.005366e6 | 353.1 | 413.2 | -7.12709 | -2.441245 | -12.874717 | -57.675089 |
| 0 | 1 | 2.005367e6 | 351.2 | 412.9 | -7.173881 | -2.448686 | -10.27556 | -55.840741 |
This has overwritten our velocity columns. As we can see, the values in the velocity columns differ slightly from those of the smooth method.
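To see where the difference comes from, the Savitzky-Golay derivative can be sketched without any dependencies. For a symmetric window and a polynomial order of 2, fitting a least-squares parabola and taking its slope at the window center reduces to a weighted sum with weights k / sum(k^2) over the offsets k = -h..h. The helper name sg_velocity, the 1000 Hz sampling rate (inferred from the 1 ms time steps) and leaving edge samples as None are assumptions of this sketch; it makes no attempt to reproduce how the library treats the edges.

```python
def sg_velocity(pos, window_length=7, sampling_rate=1000.0):
    """First derivative via a quadratic Savitzky-Golay fit (hypothetical helper)."""
    if window_length % 2 == 0 or window_length < 3:
        raise ValueError("window_length must be odd and at least 3")
    half = window_length // 2
    offsets = range(-half, half + 1)
    norm = sum(k * k for k in offsets)  # 28 for a window of 7
    vel = [None] * len(pos)  # samples without a full window stay None
    for i in range(half, len(pos) - half):
        # Slope of the fitted parabola at the window center, in degrees per sample.
        slope = sum(k * pos[i + k] for k in offsets) / norm
        vel[i] = slope * sampling_rate  # degrees per second
    return vel


# First nine x_right_pos values from the table above.
x_pos = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768,
         -10.692768, -10.687937, -10.683106, -10.678275]

print(sg_velocity(x_pos)[5])
```

For the sixth sample this yields about 2.24 degrees per second, close to the x_right_vel value in the table above, whereas the smooth method gave about 2.01 for the same sample.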
What you have learned in this tutorial:
transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()
transforming positional data into velocity data by using Dataset.pos2vel()
passing additional keyword arguments when using the Savitzky-Golay differentiation filter