Preprocessing Raw Gaze Data#

What you will learn in this tutorial:#

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

Preparations#

We import pymovements under the alias pm for convenience.

[1]:
import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
100%|██████████| 20/20 [00:00<00:00, 192.20it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7f3f86894d30>

We can verify that all files have been loaded in by checking the fileinfo attribute:

[3]:
dataset.fileinfo
[3]:
shape: (20, 3)
text_id  page_id  filepath
i64      i64      str
0        1        "aeye-lab-pymov…
0        2        "aeye-lab-pymov…
0        3        "aeye-lab-pymov…
0        4        "aeye-lab-pymov…
0        5        "aeye-lab-pymov…
1        1        "aeye-lab-pymov…
1        2        "aeye-lab-pymov…
1        3        "aeye-lab-pymov…
1        4        "aeye-lab-pymov…
1        5        "aeye-lab-pymov…
2        1        "aeye-lab-pymov…
2        2        "aeye-lab-pymov…
2        3        "aeye-lab-pymov…
2        4        "aeye-lab-pymov…
2        5        "aeye-lab-pymov…
3        1        "aeye-lab-pymov…
3        2        "aeye-lab-pymov…
3        3        "aeye-lab-pymov…
3        4        "aeye-lab-pymov…
3        5        "aeye-lab-pymov…

Now let’s inspect our gaze dataframe:

[4]:
dataset.gaze[0].frame.head()
[4]:
shape: (5, 5)
text_id  page_id  time        x_right_pix  y_right_pix
i64      i64      f64         f64          f64
0        1        1.988145e6  206.8        152.4
0        1        1.988146e6  206.9        152.1
0        1        1.988147e6  207.0        151.8
0        1        1.988148e6  207.1        151.7
0        1        1.988149e6  207.0        151.5

Apart from the label columns text_id and page_id, the dataframe contains the columns time, x_right_pix and y_right_pix.

Preprocessing#

We now want to transform these pixel coordinates into degrees of visual angle. A single call to pix2deg() does this:

[5]:
dataset.pix2deg()

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 698.11it/s]
[5]:
shape: (17_223, 7)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos
i64      i64      f64         f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -10.697598   -8.852399
0        1        1.988146e6  206.9        152.1        -10.695183   -8.859678
0        1        1.988147e6  207.0        151.8        -10.692768   -8.866956
0        1        1.988148e6  207.1        151.7        -10.690352   -8.869381
0        1        1.988149e6  207.0        151.5        -10.692768   -8.874233
0        1        1.98815e6   207.0        151.3        -10.692768   -8.879085
0        1        1.988151e6  207.2        151.4        -10.687937   -8.876659
0        1        1.988152e6  207.4        151.6        -10.683106   -8.871807
0        1        1.988153e6  207.6        151.9        -10.678275   -8.86453
0        1        1.988154e6  207.7        152.1        -10.67586    -8.859678
0        1        1.988155e6  207.7        152.1        -10.67586    -8.859678
0        1        1.988156e6  207.7        152.2        -10.67586    -8.857252
…        …        …           …            …            …            …
0        1        2.005356e6  370.4        419.0        -6.700617    -2.297363
0        1        2.005357e6  371.2        419.0        -6.680877    -2.297363
0        1        2.005358e6  371.1        418.9        -6.683345    -2.299844
0        1        2.005359e6  369.9        418.7        -6.712953    -2.304806
0        1        2.00536e6   368.1        418.1        -6.75736     -2.319691
0        1        2.005361e6  365.9        417.1        -6.811623    -2.3445
0        1        2.005362e6  363.3        416.3        -6.875737    -2.364346
0        1        2.005363e6  361.0        415.4        -6.932438    -2.386672
0        1        2.005364e6  358.0        414.5        -7.006376    -2.408998
0        1        2.005365e6  355.8        413.8        -7.060582    -2.426362
0        1        2.005366e6  353.1        413.2        -7.12709     -2.441245
0        1        2.005367e6  351.2        412.9        -7.173881    -2.448686

The processed result has been added as new columns to our gaze dataframe: x_right_pos, y_right_pos.
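To make the conversion less of a black box, here is a minimal sketch of the geometry behind a pixel-to-degree transformation. The function name pix2deg_sketch and the screen parameters (a 1280 x 1024 px screen measuring 38 x 30.2 cm, viewed from 68 cm) are assumptions made for illustration; they are not stated in this tutorial, and the library's own implementation may differ in details such as the origin convention.

```python
import math


def pix2deg_sketch(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate along one axis to degrees of visual angle."""
    # Shift the origin to the screen center (pixel indices run from 0 to screen_px - 1).
    centered_px = pixel - (screen_px - 1) / 2
    # Express the viewing distance in pixel units so both legs share one unit.
    distance_px = distance_cm * screen_px / screen_cm
    # The visual angle is the angle between the line of sight to the screen
    # center and the line of sight to the gaze point.
    return math.degrees(math.atan2(centered_px, distance_px))


# Assumed setup (hypothetical values, not taken from this tutorial):
# 1280 x 1024 px screen, 38 x 30.2 cm, viewed from 68 cm.
x_deg = pix2deg_sketch(206.8, screen_px=1280, screen_cm=38.0, distance_cm=68.0)
y_deg = pix2deg_sketch(152.4, screen_px=1024, screen_cm=30.2, distance_cm=68.0)
print(x_deg, y_deg)
```

With these assumed screen parameters, the two results land close to the x_right_pos and y_right_pos values in the first row of the table above. Positions left of or below the screen center come out negative, and the center itself maps to 0 degrees.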

We would also like to have velocity data available. There are four methods to choose from:

  • preceding: uses only the single preceding sample to calculate velocity. This is the noisiest variant.

  • neighbors: uses the two neighboring samples, one on each side. A bit less noisy.

  • smooth: uses two neighboring samples on each side, which gives a smoother result.

  • savitzky_golay: uses the Savitzky-Golay differentiation filter. You can specify additional parameters such as window_length and polyorder. With well-chosen parameters, this can give the best results.
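The first three methods can be read as finite differences over progressively wider windows. The sketch below is one plausible reading of the descriptions above, not the library's code: the function name velocity_sketch is hypothetical, the 1000 Hz sampling rate is an assumption based on the timestamps increasing by 1 per sample, and samples near the edges are simply left as None.

```python
def velocity_sketch(positions, sampling_rate, method="smooth"):
    """Finite-difference velocity in position units per second.

    Samples too close to the edges get None; edge handling is left out here.
    """
    # Each method is a list of (offset, weight) terms plus a divisor.
    kernels = {
        "preceding": ([(0, 1), (-1, -1)], 1),
        "neighbors": ([(1, 1), (-1, -1)], 2),
        "smooth": ([(2, 1), (1, 1), (-1, -1), (-2, -1)], 6),
    }
    terms, divisor = kernels[method]
    reach_back = -min(offset for offset, _ in terms)
    reach_ahead = max(offset for offset, _ in terms)

    velocities = []
    for n in range(len(positions)):
        if n - reach_back < 0 or n + reach_ahead >= len(positions):
            velocities.append(None)  # not enough neighbors on one side
            continue
        diff = sum(weight * positions[n + offset] for offset, weight in terms)
        velocities.append(diff * sampling_rate / divisor)
    return velocities


# x_right_pos of the first five samples shown earlier, assumed to be sampled at 1000 Hz.
x_pos = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]
preceding = velocity_sketch(x_pos, 1000.0, method="preceding")
neighbors = velocity_sketch(x_pos, 1000.0, method="neighbors")
smooth = velocity_sketch(x_pos, 1000.0, method="smooth")
print(preceding[2], neighbors[2], smooth[2])
```

The wider the window, the more each estimate averages over neighboring samples, which is why the list above describes the later methods as less noisy.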

Let’s use the smooth method first:

[6]:
dataset.pos2vel(method='smooth')

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 627.35it/s]
[6]:
shape: (17_223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos  y_right_vel  x_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -10.697598   -8.852399    -3.639106    1.207626
0        1        1.988146e6  206.9        152.1        -10.695183   -8.859678    -7.278067    2.415272
0        1        1.988147e6  207.0        151.8        -10.692768   -8.866956    -5.256267    1.610194
0        1        1.988148e6  207.1        151.7        -10.690352   -8.869381    -4.447465    0.402548
0        1        1.988149e6  207.0        151.5        -10.692768   -8.874233    -3.234462    0.402561
0        1        1.98815e6   207.0        151.3        -10.692768   -8.879085    -0.808615    2.012819
0        1        1.988151e6  207.2        151.4        -10.687937   -8.876659    2.83017      4.025683
0        1        1.988152e6  207.4        151.6        -10.683106   -8.871807    5.256091     4.428328
0        1        1.988153e6  207.6        151.9        -10.678275   -8.86453     4.851847     3.220663
0        1        1.988154e6  207.7        152.1        -10.67586    -8.859678    3.234622     1.610354
0        1        1.988155e6  207.7        152.1        -10.67586    -8.859678    1.617343     -2.9606e-13
0        1        1.988156e6  207.7        152.2        -10.67586    -8.857252    1.213025     -0.805187
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -6.700617    -2.297363    1.653971     30.837758
0        1        2.005357e6  371.2        419.0        -6.680877    -2.297363    -1.240481    7.401726
0        1        2.005358e6  371.1        418.9        -6.683345    -2.299844    -4.961884    -14.803188
0        1        2.005359e6  369.9        418.7        -6.712953    -2.304806    -11.164066   -34.126826
0        1        2.00536e6   368.1        418.1        -6.75736     -2.319691    -17.366038   -48.510256
0        1        2.005361e6  365.9        417.1        -6.811623    -2.3445      -21.086876   -56.310241
0        1        2.005362e6  363.3        416.3        -6.875737    -2.364346    -21.913173   -61.63851
0        1        2.005363e6  361.0        415.4        -6.932438    -2.386672    -21.085616   -63.266374
0        1        2.005364e6  358.0        414.5        -7.006376    -2.408998    -19.431326   -63.249652
0        1        2.005365e6  355.8        413.8        -7.060582    -2.426362    -15.710061   -60.359624
0        1        2.005366e6  353.1        413.2        -7.12709     -2.441245    -11.162127   -56.649541
0        1        2.005367e6  351.2        412.9        -7.173881    -2.448686    -3.720668    -23.395331

This added the following new velocity columns to our gaze dataframe: x_right_vel and y_right_vel.

We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:

[7]:
dataset.pos2vel(method='savitzky_golay', window_length=7, polyorder=2)

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 414.79it/s]
[7]:
shape: (17_223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos  y_right_vel  x_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -10.697598   -8.852399    -8.231094    1.897693
0        1        1.988146e6  206.9        152.1        -10.695183   -8.859678    -6.902524    1.66768
0        1        1.988147e6  207.0        151.8        -10.692768   -8.866956    -5.573953    1.437667
0        1        1.988148e6  207.1        151.7        -10.690352   -8.869381    -4.245382    1.207654
0        1        1.988149e6  207.0        151.5        -10.692768   -8.874233    -2.339263    1.552735
0        1        1.98815e6   207.0        151.3        -10.692768   -8.879085    0.000009     2.242885
0        1        1.988151e6  207.2        151.4        -10.687937   -8.876659    1.992718     2.933036
0        1        1.988152e6  207.4        151.6        -10.683106   -8.871807    3.378942     3.364372
0        1        1.988153e6  207.6        151.9        -10.678275   -8.86453     3.98543      2.933062
0        1        1.988154e6  207.7        152.1        -10.67586    -8.859678    3.292347     1.63908
0        1        1.988155e6  207.7        152.1        -10.67586    -8.859678    2.425984     0.517608
0        1        1.988156e6  207.7        152.2        -10.67586    -8.857252    0.953079     -0.25881
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -6.700617    -2.297363    2.215118     30.127398
0        1        2.005357e6  371.2        419.0        -6.680877    -2.297363    -1.772092    8.10499
0        1        2.005358e6  371.1        418.9        -6.683345    -2.299844    -6.645273    -12.862729
0        1        2.005359e6  369.9        418.7        -6.712953    -2.304806    -11.252527   -30.745214
0        1        2.00536e6   368.1        418.1        -6.75736     -2.319691    -15.593809   -44.219144
0        1        2.005361e6  365.9        417.1        -6.811623    -2.3445      -19.137494   -54.515696
0        1        2.005362e6  363.3        416.3        -6.875737    -2.364346    -20.909048   -59.347609
0        1        2.005363e6  361.0        415.4        -6.932438    -2.386672    -20.465552   -62.062479
0        1        2.005364e6  358.0        414.5        -7.006376    -2.408998    -18.073031   -61.343786
0        1        2.005365e6  355.8        413.8        -7.060582    -2.426362    -15.473874   -59.509437
0        1        2.005366e6  353.1        413.2        -7.12709     -2.441245    -12.874717   -57.675089
0        1        2.005367e6  351.2        412.9        -7.173881    -2.448686    -10.27556    -55.840741

This has overwritten our velocity columns: the dataframe still has nine columns, and the velocity values differ slightly from those computed with the smooth method.
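For intuition about what the Savitzky-Golay differentiation filter computes, the following sketch fits a quadratic to each window by least squares and takes the slope at the window center. For polyorder=2 on evenly spaced samples this slope has a closed form, which is what the code uses. The function name savgol_velocity_sketch is hypothetical, the 1000 Hz sampling rate is an assumption, and edge samples are left as None; the library may delegate to another routine and treat edges differently.

```python
def savgol_velocity_sketch(positions, sampling_rate, window_length=7):
    """First derivative from a local quadratic fit (Savitzky-Golay, polyorder=2)."""
    if window_length % 2 == 0 or window_length < 3:
        raise ValueError("window_length must be an odd integer >= 3")
    half = window_length // 2
    offsets = range(-half, half + 1)
    # For a symmetric window, the least-squares slope at the center is
    # sum(k * x[n + k]) / sum(k * k), with k the offset from the center.
    norm = sum(k * k for k in offsets)

    velocities = []
    for n in range(len(positions)):
        if n < half or n + half >= len(positions):
            velocities.append(None)  # edges are skipped in this sketch
            continue
        slope_per_sample = sum(k * positions[n + k] for k in offsets) / norm
        velocities.append(slope_per_sample * sampling_rate)
    return velocities


# x_right_pos of the first seven samples shown earlier, assumed to be sampled at 1000 Hz.
x_pos = [-10.697598, -10.695183, -10.692768, -10.690352,
         -10.692768, -10.692768, -10.687937]
sg_velocity = savgol_velocity_sketch(x_pos, 1000.0, window_length=7)
print(sg_velocity[3])
```

A longer window_length averages over more samples and smooths more strongly, at the cost of blurring fast changes such as saccade onsets.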

What you have learned in this tutorial:#

  • transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()

  • transforming positional data into velocity data by using Dataset.pos2vel()

  • passing additional keyword arguments when using the Savitzky-Golay differentiation filter