Preprocessing Raw Gaze Data

What you will learn in this tutorial:

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

Preparations

We import pymovements under the alias pm for convenience.

[1]:
import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
100%|██████████| 20/20 [00:00<00:00, 207.37it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7f7b0ea12ac0>

We can verify that all files have been loaded in by checking the fileinfo attribute:

[3]:
dataset.fileinfo
[3]:
shape: (20, 3)
text_id  page_id  filepath
i64      i64      str
0        1        "aeye-lab-pymov…
0        2        "aeye-lab-pymov…
0        3        "aeye-lab-pymov…
0        4        "aeye-lab-pymov…
0        5        "aeye-lab-pymov…
1        1        "aeye-lab-pymov…
1        2        "aeye-lab-pymov…
1        3        "aeye-lab-pymov…
1        4        "aeye-lab-pymov…
1        5        "aeye-lab-pymov…
2        1        "aeye-lab-pymov…
2        2        "aeye-lab-pymov…
2        3        "aeye-lab-pymov…
2        4        "aeye-lab-pymov…
2        5        "aeye-lab-pymov…
3        1        "aeye-lab-pymov…
3        2        "aeye-lab-pymov…
3        3        "aeye-lab-pymov…
3        4        "aeye-lab-pymov…
3        5        "aeye-lab-pymov…

Now let’s inspect our gaze dataframe:

[4]:
dataset.gaze[0].frame.head()
[4]:
shape: (5, 5)
text_id  page_id  time        x_right_pix  y_right_pix
i64      i64      f64         f64          f64
0        1        1.988145e6  206.8        152.4
0        1        1.988146e6  206.9        152.1
0        1        1.988147e6  207.0        151.8
0        1        1.988148e6  207.1        151.7
0        1        1.988149e6  207.0        151.5

Apart from the text_id and page_id labels, the dataframe has the following columns: time, x_right_pix and y_right_pix.

Preprocessing

We now want to transform these pixel coordinates into degrees of visual angle. A single method call does this:

[5]:
dataset.pix2deg()

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 705.52it/s]
[5]:
shape: (17223, 7)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos
i64      i64      f64         f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -10.697598   -8.852399
0        1        1.988146e6  206.9        152.1        -10.695183   -8.859678
0        1        1.988147e6  207.0        151.8        -10.692768   -8.866956
0        1        1.988148e6  207.1        151.7        -10.690352   -8.869381
0        1        1.988149e6  207.0        151.5        -10.692768   -8.874233
0        1        1.98815e6   207.0        151.3        -10.692768   -8.879085
0        1        1.988151e6  207.2        151.4        -10.687937   -8.876659
0        1        1.988152e6  207.4        151.6        -10.683106   -8.871807
0        1        1.988153e6  207.6        151.9        -10.678275   -8.86453
0        1        1.988154e6  207.7        152.1        -10.67586    -8.859678
0        1        1.988155e6  207.7        152.1        -10.67586    -8.859678
0        1        1.988156e6  207.7        152.2        -10.67586    -8.857252
…        …        …           …            …            …            …
0        1        2.005356e6  370.4        419.0        -6.700617    -2.297363
0        1        2.005357e6  371.2        419.0        -6.680877    -2.297363
0        1        2.005358e6  371.1        418.9        -6.683345    -2.299844
0        1        2.005359e6  369.9        418.7        -6.712953    -2.304806
0        1        2.00536e6   368.1        418.1        -6.75736     -2.319691
0        1        2.005361e6  365.9        417.1        -6.811623    -2.3445
0        1        2.005362e6  363.3        416.3        -6.875737    -2.364346
0        1        2.005363e6  361.0        415.4        -6.932438    -2.386672
0        1        2.005364e6  358.0        414.5        -7.006376    -2.408998
0        1        2.005365e6  355.8        413.8        -7.060582    -2.426362
0        1        2.005366e6  353.1        413.2        -7.12709     -2.441245
0        1        2.005367e6  351.2        412.9        -7.173881    -2.448686

The processed result has been added as new columns to our gaze dataframe: x_right_pos, y_right_pos.
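
For intuition, the geometry behind such a conversion can be sketched in plain Python. The function name, the screen geometry (1280 px across 38 cm, viewed from 68 cm) and the centering convention below are assumptions made for illustration; none of them are stated in this tutorial, and the library's own implementation may differ.

```python
import math


def pix_to_deg(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (sketch).

    Assumes the eye is centered on the screen, so 0 degrees
    corresponds to the middle of the screen along this axis.
    """
    # Shift the origin from the screen edge to the screen center.
    centered_px = pixel - (screen_px - 1) / 2
    # Convert the pixel offset to a physical offset on the screen.
    offset_cm = centered_px * screen_cm / screen_px
    # The visual angle is the angle subtended by that offset at the eye.
    return math.degrees(math.atan2(offset_cm, distance_cm))


# Hypothetical screen geometry, chosen for illustration only.
x_deg = pix_to_deg(206.8, screen_px=1280, screen_cm=38.0, distance_cm=68.0)
print(round(x_deg, 3))
```

With these assumed values, the result lands close to the x_right_pos value in the first row of the output above. A pixel left of the screen center maps to a negative angle.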

In addition, we want velocity data. Four methods are available:

  • preceding: uses only the single preceding sample for the velocity calculation. This is the noisiest variant.

  • neighbors: uses the two neighboring samples, one on each side. This is a bit less noisy.

  • smooth: widens the window to two neighboring samples on each side, which yields a smoother velocity signal.

  • savitzky_golay: applies the Savitzky-Golay differentiation filter. You can pass additional parameters such as window_length and polyorder. With well-chosen parameters, this method can give the best results.
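
The three finite-difference variants in the list above can be sketched as follows. This is a minimal plain-Python illustration, not the library's code: the function names, the sampling rate of 1000 Hz, and the choice to return None at the edges where the window does not fit are all assumptions.

```python
def vel_preceding(pos, sampling_rate):
    """Difference to the single preceding sample."""
    vel = [None] * len(pos)
    for i in range(1, len(pos)):
        vel[i] = (pos[i] - pos[i - 1]) * sampling_rate
    return vel


def vel_neighbors(pos, sampling_rate):
    """Central difference over the two neighboring samples."""
    vel = [None] * len(pos)
    for i in range(1, len(pos) - 1):
        vel[i] = (pos[i + 1] - pos[i - 1]) * sampling_rate / 2
    return vel


def vel_smooth(pos, sampling_rate):
    """Central difference averaged over two samples on each side."""
    vel = [None] * len(pos)
    for i in range(2, len(pos) - 2):
        diff = pos[i + 2] + pos[i + 1] - pos[i - 1] - pos[i - 2]
        vel[i] = diff * sampling_rate / 6
    return vel


# First five x_right_pos values from the output above, in degrees.
x_pos = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]
# Assumed sampling rate of 1000 Hz (one sample per time step).
print(vel_smooth(x_pos, sampling_rate=1000)[2])
```

Widening the window averages out more sample-to-sample noise at the cost of temporal resolution, which is why smooth is less noisy than preceding.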

Let’s use the smooth method first:

[6]:
dataset.pos2vel(method='smooth')

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 674.98it/s]
[6]:
shape: (17223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos  x_right_vel  y_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -10.697598   -8.852399    1.207626     -3.639106
0        1        1.988146e6  206.9        152.1        -10.695183   -8.859678    2.415272     -7.278067
0        1        1.988147e6  207.0        151.8        -10.692768   -8.866956    1.610194     -5.256267
0        1        1.988148e6  207.1        151.7        -10.690352   -8.869381    0.402548     -4.447465
0        1        1.988149e6  207.0        151.5        -10.692768   -8.874233    0.402561     -3.234462
0        1        1.98815e6   207.0        151.3        -10.692768   -8.879085    2.012819     -0.808615
0        1        1.988151e6  207.2        151.4        -10.687937   -8.876659    4.025683     2.83017
0        1        1.988152e6  207.4        151.6        -10.683106   -8.871807    4.428328     5.256091
0        1        1.988153e6  207.6        151.9        -10.678275   -8.86453     3.220663     4.851847
0        1        1.988154e6  207.7        152.1        -10.67586    -8.859678    1.610354     3.234622
0        1        1.988155e6  207.7        152.1        -10.67586    -8.859678    -2.9606e-13  1.617343
0        1        1.988156e6  207.7        152.2        -10.67586    -8.857252    -0.805187    1.213025
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -6.700617    -2.297363    30.837758    1.653971
0        1        2.005357e6  371.2        419.0        -6.680877    -2.297363    7.401726     -1.240481
0        1        2.005358e6  371.1        418.9        -6.683345    -2.299844    -14.803188   -4.961884
0        1        2.005359e6  369.9        418.7        -6.712953    -2.304806    -34.126826   -11.164066
0        1        2.00536e6   368.1        418.1        -6.75736     -2.319691    -48.510256   -17.366038
0        1        2.005361e6  365.9        417.1        -6.811623    -2.3445      -56.310241   -21.086876
0        1        2.005362e6  363.3        416.3        -6.875737    -2.364346    -61.63851    -21.913173
0        1        2.005363e6  361.0        415.4        -6.932438    -2.386672    -63.266374   -21.085616
0        1        2.005364e6  358.0        414.5        -7.006376    -2.408998    -63.249652   -19.431326
0        1        2.005365e6  355.8        413.8        -7.060582    -2.426362    -60.359624   -15.710061
0        1        2.005366e6  353.1        413.2        -7.12709     -2.441245    -56.649541   -11.162127
0        1        2.005367e6  351.2        412.9        -7.173881    -2.448686    -23.395331   -3.720668

This added two new velocity columns to our gaze dataframe: x_right_vel and y_right_vel.

We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:

[7]:
dataset.pos2vel(method='savitzky_golay', window_length=7, polyorder=2)

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 413.35it/s]
[7]:
shape: (17223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos  x_right_vel  y_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -10.697598   -8.852399    1.897693     -8.231094
0        1        1.988146e6  206.9        152.1        -10.695183   -8.859678    1.66768      -6.902524
0        1        1.988147e6  207.0        151.8        -10.692768   -8.866956    1.437667     -5.573953
0        1        1.988148e6  207.1        151.7        -10.690352   -8.869381    1.207654     -4.245382
0        1        1.988149e6  207.0        151.5        -10.692768   -8.874233    1.552735     -2.339263
0        1        1.98815e6   207.0        151.3        -10.692768   -8.879085    2.242885     0.000009
0        1        1.988151e6  207.2        151.4        -10.687937   -8.876659    2.933036     1.992718
0        1        1.988152e6  207.4        151.6        -10.683106   -8.871807    3.364372     3.378942
0        1        1.988153e6  207.6        151.9        -10.678275   -8.86453     2.933062     3.98543
0        1        1.988154e6  207.7        152.1        -10.67586    -8.859678    1.63908      3.292347
0        1        1.988155e6  207.7        152.1        -10.67586    -8.859678    0.517608     2.425984
0        1        1.988156e6  207.7        152.2        -10.67586    -8.857252    -0.25881     0.953079
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -6.700617    -2.297363    30.127398    2.215118
0        1        2.005357e6  371.2        419.0        -6.680877    -2.297363    8.10499      -1.772092
0        1        2.005358e6  371.1        418.9        -6.683345    -2.299844    -12.862729   -6.645273
0        1        2.005359e6  369.9        418.7        -6.712953    -2.304806    -30.745214   -11.252527
0        1        2.00536e6   368.1        418.1        -6.75736     -2.319691    -44.219144   -15.593809
0        1        2.005361e6  365.9        417.1        -6.811623    -2.3445      -54.515696   -19.137494
0        1        2.005362e6  363.3        416.3        -6.875737    -2.364346    -59.347609   -20.909048
0        1        2.005363e6  361.0        415.4        -6.932438    -2.386672    -62.062479   -20.465552
0        1        2.005364e6  358.0        414.5        -7.006376    -2.408998    -61.343786   -18.073031
0        1        2.005365e6  355.8        413.8        -7.060582    -2.426362    -59.509437   -15.473874
0        1        2.005366e6  353.1        413.2        -7.12709     -2.441245    -57.675089   -12.874717
0        1        2.005367e6  351.2        412.9        -7.173881    -2.448686    -55.840741   -10.27556

This call has overwritten the velocity columns from the previous step. The new values differ slightly from the smooth results, with the largest differences in the first and last rows.
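
To see where values of this kind come from, here is a sketch of Savitzky-Golay differentiation for interior samples only. It relies on a standard property: for a symmetric window and a polyorder of 1 or 2, the slope of the least-squares fit at the window center is a weighted sum of the samples, with weights proportional to each sample's offset from the center. The function name, the sampling rate of 1000 Hz, and the omission of edge handling are assumptions; the library's implementation may treat edges and higher polynomial orders differently.

```python
def savgol_velocity(pos, sampling_rate, window_length=7):
    """First derivative from a least-squares fit per window (sketch).

    Valid for polyorder 1 or 2 only; edge samples are left as None.
    """
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd")
    half = window_length // 2
    offsets = range(-half, half + 1)
    # Normalization of the slope weights: sum of squared offsets.
    norm = sum(k * k for k in offsets)
    vel = [None] * len(pos)
    for i in range(half, len(pos) - half):
        # Fitted slope per sample at the window center.
        slope = sum(k * pos[i + k] for k in offsets) / norm
        # Scale from "per sample" to "per second".
        vel[i] = slope * sampling_rate
    return vel


# First seven x_right_pos values from the output above, in degrees.
x_pos = [-10.697598, -10.695183, -10.692768, -10.690352,
         -10.692768, -10.692768, -10.687937]
# Assumed sampling rate of 1000 Hz (one sample per time step).
print(savgol_velocity(x_pos, sampling_rate=1000)[3])
```

A longer window smooths more strongly but blurs fast changes such as saccade onsets, so window_length is worth tuning to the sampling rate of your data.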

What you have learned in this tutorial:

  • transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()

  • transforming positional data into velocity data by using Dataset.pos2vel()

  • passing additional keyword arguments when using the Savitzky-Golay differentiation filter