Preprocessing Raw Gaze Data
What you will learn in this tutorial:
how to transform pixel coordinates into degrees of visual angle
how to transform positional data into velocity data
Preparations
We import pymovements as the alias pm for convenience.
[1]:
import pymovements as pm
Let’s start by downloading and extracting our ToyDataset:
[2]:
dataset = pm.datasets.ToyDataset(root='data/')
dataset.download()
dataset.extract()
dataset.load()
Downloading http://github.com/aeye-lab/pymovements-toy-dataset/zipball/6cb5d663317bf418cec0c9abe1dde5085a8a8ebd/ to data/ToyDataset/downloads/pymovements-toy-dataset.zip
pymovements-toy-dataset.zip: 100%|██████████| 3.06M/3.06M [00:00<00:00, 23.8MB/s]
100%|██████████| 20/20 [00:00<00:00, 200.88it/s]
We can verify that all files have been loaded in by checking the fileinfo attribute:
[3]:
dataset.fileinfo
[3]:
| text_id | page_id | filepath |
|---|---|---|
| i64 | i64 | str |
| 0 | 1 | "aeye-lab-pymov... |
| 0 | 2 | "aeye-lab-pymov... |
| 0 | 3 | "aeye-lab-pymov... |
| 0 | 4 | "aeye-lab-pymov... |
| 0 | 5 | "aeye-lab-pymov... |
| 1 | 1 | "aeye-lab-pymov... |
| 1 | 2 | "aeye-lab-pymov... |
| 1 | 3 | "aeye-lab-pymov... |
| 1 | 4 | "aeye-lab-pymov... |
| 1 | 5 | "aeye-lab-pymov... |
| 2 | 1 | "aeye-lab-pymov... |
| 2 | 2 | "aeye-lab-pymov... |
| 2 | 3 | "aeye-lab-pymov... |
| 2 | 4 | "aeye-lab-pymov... |
| 2 | 5 | "aeye-lab-pymov... |
| 3 | 1 | "aeye-lab-pymov... |
| 3 | 2 | "aeye-lab-pymov... |
| 3 | 3 | "aeye-lab-pymov... |
| 3 | 4 | "aeye-lab-pymov... |
| 3 | 5 | "aeye-lab-pymov... |
Now let’s inspect our gaze dataframe:
[4]:
dataset.gaze[0].frame.head()
[4]:
| text_id | page_id | time | x_right_pix | y_right_pix |
|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 |
Apart from some additional labels we see the following columns: time, x_right_pix and y_right_pix.
Preprocessing
We now want to transform these pixel position coordinates into coordinates in degrees of visual angle. This is simply done by:
[5]:
dataset.pix2deg()
dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 720.99it/s]
[5]:
| text_id | page_id | time | x_right_pix | y_right_pix | x_right_pos | y_right_pos |
|---|---|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 | -10.697598 | -8.852399 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 | -10.695183 | -8.859678 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 | -10.692768 | -8.866956 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 | -10.690352 | -8.869381 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 | -10.692768 | -8.874233 |
| 0 | 1 | 1.98815e6 | 207.0 | 151.3 | -10.692768 | -8.879085 |
| 0 | 1 | 1.988151e6 | 207.2 | 151.4 | -10.687937 | -8.876659 |
| 0 | 1 | 1.988152e6 | 207.4 | 151.6 | -10.683106 | -8.871807 |
| 0 | 1 | 1.988153e6 | 207.6 | 151.9 | -10.678275 | -8.86453 |
| 0 | 1 | 1.988154e6 | 207.7 | 152.1 | -10.67586 | -8.859678 |
| 0 | 1 | 1.988155e6 | 207.7 | 152.1 | -10.67586 | -8.859678 |
| 0 | 1 | 1.988156e6 | 207.7 | 152.2 | -10.67586 | -8.857252 |
| ... | ... | ... | ... | ... | ... | ... |
| 0 | 1 | 2.005356e6 | 370.4 | 419.0 | -6.700617 | -2.297363 |
| 0 | 1 | 2.005357e6 | 371.2 | 419.0 | -6.680877 | -2.297363 |
| 0 | 1 | 2.005358e6 | 371.1 | 418.9 | -6.683345 | -2.299844 |
| 0 | 1 | 2.005359e6 | 369.9 | 418.7 | -6.712953 | -2.304806 |
| 0 | 1 | 2.00536e6 | 368.1 | 418.1 | -6.75736 | -2.319691 |
| 0 | 1 | 2.005361e6 | 365.9 | 417.1 | -6.811623 | -2.3445 |
| 0 | 1 | 2.005362e6 | 363.3 | 416.3 | -6.875737 | -2.364346 |
| 0 | 1 | 2.005363e6 | 361.0 | 415.4 | -6.932438 | -2.386672 |
| 0 | 1 | 2.005364e6 | 358.0 | 414.5 | -7.006376 | -2.408998 |
| 0 | 1 | 2.005365e6 | 355.8 | 413.8 | -7.060582 | -2.426362 |
| 0 | 1 | 2.005366e6 | 353.1 | 413.2 | -7.12709 | -2.441245 |
| 0 | 1 | 2.005367e6 | 351.2 | 412.9 | -7.173881 | -2.448686 |
The processed result has been added as new columns to our gaze dataframe: x_right_pos, y_right_pos.
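To make the conversion less of a black box, here is a minimal sketch of the geometry behind a pixel-to-degree transformation. It is not the pymovements implementation. The function name pix2deg_sketch and the screen geometry (1280 px wide, 38 cm wide, viewed from 68 cm) are assumptions chosen for illustration; the real values come from the dataset's experiment definition.

```python
import math


def pix2deg_sketch(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle.

    Hypothetical helper for illustration; the screen centre maps to 0 degrees.
    """
    # Centre the coordinate: for n pixels indexed 0..n-1 the centre is (n - 1) / 2.
    centered_px = pixel - (screen_px - 1) / 2
    # Convert the offset from pixels to centimetres on the screen.
    offset_cm = centered_px * screen_cm / screen_px
    # The visual angle is the angle subtended at the eye by that offset.
    return math.degrees(math.atan2(offset_cm, distance_cm))


# Assumed geometry, for illustration only.
x_deg = pix2deg_sketch(206.8, screen_px=1280, screen_cm=38.0, distance_cm=68.0)
print(x_deg)
```

Under these assumed screen parameters, the first x_right_pix sample lands close to the x_right_pos value shown in the table above. Negative angles simply mean the gaze is left of (or below) the screen centre.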
We would also like to have velocity data available. Four different methods are available:
preceding: takes only the single preceding sample into account for the velocity calculation. This is the noisiest variant.
neighbors: takes the neighboring samples into account for the velocity calculation. A bit less noisy.
smooth: increases the neighboring samples to two on each side. You can get a smooth conversion this way.
savitzky_golay: uses the Savitzky-Golay differentiation filter for the conversion. You can specify additional parameters like window_length and polyorder. With well-chosen parameters, this can give the best results.
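The three finite-difference methods can be sketched in a few lines of plain Python. This is an illustrative sketch, not the library's code: the function name pos2vel_sketch is hypothetical, the 1000 Hz sampling rate is an assumption inferred from the 1 ms time steps in the table, and edge samples that lack enough neighbors are simply left as None here.

```python
def pos2vel_sketch(positions, sampling_rate, method="smooth"):
    """Finite-difference velocity estimate; edge samples are left as None."""
    if method not in ("preceding", "neighbors", "smooth"):
        raise ValueError(f"unknown method: {method}")
    n = len(positions)
    velocities = [None] * n
    for i in range(n):
        if method == "preceding" and i >= 1:
            # Difference to the single preceding sample.
            velocities[i] = (positions[i] - positions[i - 1]) * sampling_rate
        elif method == "neighbors" and 1 <= i < n - 1:
            # Central difference over one neighbor on each side.
            velocities[i] = (positions[i + 1] - positions[i - 1]) * sampling_rate / 2
        elif method == "smooth" and 2 <= i < n - 2:
            # Central difference averaged over two neighbors on each side.
            velocities[i] = (
                positions[i + 2] + positions[i + 1]
                - positions[i - 1] - positions[i - 2]
            ) * sampling_rate / 6
    return velocities


# First five x_right_pos samples from the table above.
x_pos = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]
print(pos2vel_sketch(x_pos, sampling_rate=1000.0, method="smooth"))
```

The wider the window, the more each estimate averages over measurement noise, which is why preceding is the noisiest variant and smooth the least noisy of the three.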
Let’s use the smooth method first:
[6]:
dataset.pos2vel(method='smooth')
dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 663.60it/s]
[6]:
| text_id | page_id | time | x_right_pix | y_right_pix | x_right_pos | y_right_pos | x_right_vel | y_right_vel |
|---|---|---|---|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 | -10.697598 | -8.852399 | 1.207626 | -3.639106 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 | -10.695183 | -8.859678 | 2.415272 | -7.278067 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 | -10.692768 | -8.866956 | 1.610194 | -5.256267 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 | -10.690352 | -8.869381 | 0.402548 | -4.447465 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 | -10.692768 | -8.874233 | 0.402561 | -3.234462 |
| 0 | 1 | 1.98815e6 | 207.0 | 151.3 | -10.692768 | -8.879085 | 2.012819 | -0.808615 |
| 0 | 1 | 1.988151e6 | 207.2 | 151.4 | -10.687937 | -8.876659 | 4.025683 | 2.83017 |
| 0 | 1 | 1.988152e6 | 207.4 | 151.6 | -10.683106 | -8.871807 | 4.428328 | 5.256091 |
| 0 | 1 | 1.988153e6 | 207.6 | 151.9 | -10.678275 | -8.86453 | 3.220663 | 4.851847 |
| 0 | 1 | 1.988154e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | 1.610354 | 3.234622 |
| 0 | 1 | 1.988155e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | -2.9606e-13 | 1.617343 |
| 0 | 1 | 1.988156e6 | 207.7 | 152.2 | -10.67586 | -8.857252 | -0.805187 | 1.213025 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 0 | 1 | 2.005356e6 | 370.4 | 419.0 | -6.700617 | -2.297363 | 30.837758 | 1.653971 |
| 0 | 1 | 2.005357e6 | 371.2 | 419.0 | -6.680877 | -2.297363 | 7.401726 | -1.240481 |
| 0 | 1 | 2.005358e6 | 371.1 | 418.9 | -6.683345 | -2.299844 | -14.803188 | -4.961884 |
| 0 | 1 | 2.005359e6 | 369.9 | 418.7 | -6.712953 | -2.304806 | -34.126826 | -11.164066 |
| 0 | 1 | 2.00536e6 | 368.1 | 418.1 | -6.75736 | -2.319691 | -48.510256 | -17.366038 |
| 0 | 1 | 2.005361e6 | 365.9 | 417.1 | -6.811623 | -2.3445 | -56.310241 | -21.086876 |
| 0 | 1 | 2.005362e6 | 363.3 | 416.3 | -6.875737 | -2.364346 | -61.63851 | -21.913173 |
| 0 | 1 | 2.005363e6 | 361.0 | 415.4 | -6.932438 | -2.386672 | -63.266374 | -21.085616 |
| 0 | 1 | 2.005364e6 | 358.0 | 414.5 | -7.006376 | -2.408998 | -63.249652 | -19.431326 |
| 0 | 1 | 2.005365e6 | 355.8 | 413.8 | -7.060582 | -2.426362 | -60.359624 | -15.710061 |
| 0 | 1 | 2.005366e6 | 353.1 | 413.2 | -7.12709 | -2.441245 | -56.649541 | -11.162127 |
| 0 | 1 | 2.005367e6 | 351.2 | 412.9 | -7.173881 | -2.448686 | -23.395331 | -3.720668 |
This added the following new velocity columns to our gaze dataframe: x_right_vel, y_right_vel.
We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:
[7]:
dataset.pos2vel(method='savitzky_golay', window_length=7, polyorder=2)
dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 406.50it/s]
[7]:
| text_id | page_id | time | x_right_pix | y_right_pix | x_right_pos | y_right_pos | x_right_vel | y_right_vel |
|---|---|---|---|---|---|---|---|---|
| i64 | i64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
| 0 | 1 | 1.988145e6 | 206.8 | 152.4 | -10.697598 | -8.852399 | 1.897693 | -8.231094 |
| 0 | 1 | 1.988146e6 | 206.9 | 152.1 | -10.695183 | -8.859678 | 1.66768 | -6.902524 |
| 0 | 1 | 1.988147e6 | 207.0 | 151.8 | -10.692768 | -8.866956 | 1.437667 | -5.573953 |
| 0 | 1 | 1.988148e6 | 207.1 | 151.7 | -10.690352 | -8.869381 | 1.207654 | -4.245382 |
| 0 | 1 | 1.988149e6 | 207.0 | 151.5 | -10.692768 | -8.874233 | 1.552735 | -2.339263 |
| 0 | 1 | 1.98815e6 | 207.0 | 151.3 | -10.692768 | -8.879085 | 2.242885 | 0.000009 |
| 0 | 1 | 1.988151e6 | 207.2 | 151.4 | -10.687937 | -8.876659 | 2.933036 | 1.992718 |
| 0 | 1 | 1.988152e6 | 207.4 | 151.6 | -10.683106 | -8.871807 | 3.364372 | 3.378942 |
| 0 | 1 | 1.988153e6 | 207.6 | 151.9 | -10.678275 | -8.86453 | 2.933062 | 3.98543 |
| 0 | 1 | 1.988154e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | 1.63908 | 3.292347 |
| 0 | 1 | 1.988155e6 | 207.7 | 152.1 | -10.67586 | -8.859678 | 0.517608 | 2.425984 |
| 0 | 1 | 1.988156e6 | 207.7 | 152.2 | -10.67586 | -8.857252 | -0.25881 | 0.953079 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 0 | 1 | 2.005356e6 | 370.4 | 419.0 | -6.700617 | -2.297363 | 30.127398 | 2.215118 |
| 0 | 1 | 2.005357e6 | 371.2 | 419.0 | -6.680877 | -2.297363 | 8.10499 | -1.772092 |
| 0 | 1 | 2.005358e6 | 371.1 | 418.9 | -6.683345 | -2.299844 | -12.862729 | -6.645273 |
| 0 | 1 | 2.005359e6 | 369.9 | 418.7 | -6.712953 | -2.304806 | -30.745214 | -11.252527 |
| 0 | 1 | 2.00536e6 | 368.1 | 418.1 | -6.75736 | -2.319691 | -44.219144 | -15.593809 |
| 0 | 1 | 2.005361e6 | 365.9 | 417.1 | -6.811623 | -2.3445 | -54.515696 | -19.137494 |
| 0 | 1 | 2.005362e6 | 363.3 | 416.3 | -6.875737 | -2.364346 | -59.347609 | -20.909048 |
| 0 | 1 | 2.005363e6 | 361.0 | 415.4 | -6.932438 | -2.386672 | -62.062479 | -20.465552 |
| 0 | 1 | 2.005364e6 | 358.0 | 414.5 | -7.006376 | -2.408998 | -61.343786 | -18.073031 |
| 0 | 1 | 2.005365e6 | 355.8 | 413.8 | -7.060582 | -2.426362 | -59.509437 | -15.473874 |
| 0 | 1 | 2.005366e6 | 353.1 | 413.2 | -7.12709 | -2.441245 | -57.675089 | -12.874717 |
| 0 | 1 | 2.005367e6 | 351.2 | 412.9 | -7.173881 | -2.448686 | -55.840741 | -10.27556 |
This has overwritten our velocity columns. As we see, the values in the velocity columns are slightly different.
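To see where the differences come from, it helps to write out what a Savitzky-Golay derivative does at an interior sample. For polyorder 1 or 2, fitting a polynomial to a symmetric window and taking its slope at the centre reduces to a weighted sum with weights k / Σk², where k runs over the window offsets. The sketch below implements just that special case in plain Python. The function name savgol_velocity_sketch is hypothetical, the 1000 Hz sampling rate is an assumption inferred from the 1 ms time steps, and edge samples are left as None rather than handled the way a full implementation would.

```python
def savgol_velocity_sketch(positions, sampling_rate, window_length=7):
    """First-derivative Savitzky-Golay estimate for polyorder 1 or 2.

    Only interior samples are computed; edge samples are left as None.
    """
    if window_length % 2 == 0 or window_length < 3:
        raise ValueError("window_length must be an odd integer >= 3")
    half = window_length // 2
    offsets = range(-half, half + 1)
    # Normalisation of the least-squares slope: sum of squared offsets.
    norm = sum(k * k for k in offsets)
    n = len(positions)
    velocities = [None] * n
    for i in range(half, n - half):
        # Slope of the best-fit polynomial at the window centre.
        weighted = sum(k * positions[i + k] for k in offsets)
        velocities[i] = weighted / norm * sampling_rate
    return velocities


# First seven x_right_pos samples from the table above.
x_pos = [-10.697598, -10.695183, -10.692768, -10.690352,
         -10.692768, -10.692768, -10.687937]
print(savgol_velocity_sketch(x_pos, sampling_rate=1000.0, window_length=7))
```

With window_length=7 the weights are (-3, -2, -1, 0, 1, 2, 3) / 28, so seven samples contribute to each estimate instead of the four used by the smooth method, which explains the slightly different and somewhat smoother values.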
What you have learned in this tutorial:
transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()
transforming positional data into velocity data by using Dataset.pos2vel()
passing additional keyword arguments when using the Savitzky-Golay differentiation filter