Preprocessing Raw Gaze Data

What you will learn in this tutorial:

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

Preparations

We import pymovements under the alias pm for convenience.

[1]:
import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
100%|██████████| 20/20 [00:00<00:00, 45.33it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7f410cde8af0>

We can verify that all files have been loaded in by checking the fileinfo attribute:

[3]:
dataset.fileinfo
[3]:
shape: (20, 3)
text_id  page_id  filepath
i64      i64      str
0        1        "aeye-lab-pymov…
0        2        "aeye-lab-pymov…
0        3        "aeye-lab-pymov…
0        4        "aeye-lab-pymov…
0        5        "aeye-lab-pymov…
1        1        "aeye-lab-pymov…
1        2        "aeye-lab-pymov…
1        3        "aeye-lab-pymov…
1        4        "aeye-lab-pymov…
1        5        "aeye-lab-pymov…
2        1        "aeye-lab-pymov…
2        2        "aeye-lab-pymov…
2        3        "aeye-lab-pymov…
2        4        "aeye-lab-pymov…
2        5        "aeye-lab-pymov…
3        1        "aeye-lab-pymov…
3        2        "aeye-lab-pymov…
3        3        "aeye-lab-pymov…
3        4        "aeye-lab-pymov…
3        5        "aeye-lab-pymov…

Now let’s inspect our gaze dataframe:

[4]:
dataset.gaze[0].frame.head()
[4]:
shape: (5, 6)
text_id  page_id  time        stimuli_x  stimuli_y  pixel
i64      i64      f64         f64        f64        list[f64]
0        1        1.988145e6  -1.0       -1.0       [206.8, 152.4]
0        1        1.988146e6  -1.0       -1.0       [206.9, 152.1]
0        1        1.988147e6  -1.0       -1.0       [207.0, 151.8]
0        1        1.988148e6  -1.0       -1.0       [207.1, 151.7]
0        1        1.988149e6  -1.0       -1.0       [207.0, 151.5]

Apart from some trial identifier columns we see the columns time and pixel.

Preprocessing

We now want to transform these pixel coordinates into degrees of visual angle. This takes a single call:

[5]:
dataset.pix2deg()

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 22.38it/s]
[5]:
shape: (17_223, 7)
text_id  page_id  time        stimuli_x  stimuli_y  pixel           position
i64      i64      f64         f64        f64        list[f64]       list[f64]
0        1        1.988145e6  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]
0        1        1.988146e6  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]
0        1        1.988147e6  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]
0        1        1.988148e6  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]
0        1        1.988149e6  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]
0        1        1.98815e6   -1.0       -1.0       [207.0, 151.3]  [-10.692768, -8.879085]
0        1        1.988151e6  -1.0       -1.0       [207.2, 151.4]  [-10.687937, -8.876659]
0        1        1.988152e6  -1.0       -1.0       [207.4, 151.6]  [-10.683106, -8.871807]
0        1        1.988153e6  -1.0       -1.0       [207.6, 151.9]  [-10.678275, -8.86453]
0        1        1.988154e6  -1.0       -1.0       [207.7, 152.1]  [-10.67586, -8.859678]
0        1        1.988155e6  -1.0       -1.0       [207.7, 152.1]  [-10.67586, -8.859678]
0        1        1.988156e6  -1.0       -1.0       [207.7, 152.2]  [-10.67586, -8.857252]
…        …        …           …          …          …               …
0        1        2.005356e6  -1.0       -1.0       [370.4, 419.0]  [-6.700617, -2.297363]
0        1        2.005357e6  -1.0       -1.0       [371.2, 419.0]  [-6.680877, -2.297363]
0        1        2.005358e6  -1.0       -1.0       [371.1, 418.9]  [-6.683345, -2.299844]
0        1        2.005359e6  -1.0       -1.0       [369.9, 418.7]  [-6.712953, -2.304806]
0        1        2.00536e6   -1.0       -1.0       [368.1, 418.1]  [-6.75736, -2.319691]
0        1        2.005361e6  -1.0       -1.0       [365.9, 417.1]  [-6.811623, -2.3445]
0        1        2.005362e6  -1.0       -1.0       [363.3, 416.3]  [-6.875737, -2.364346]
0        1        2.005363e6  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]
0        1        2.005364e6  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]
0        1        2.005365e6  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]
0        1        2.005366e6  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]
0        1        2.005367e6  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]

The processed result has been added as a new column named position to our gaze dataframe.
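
To make the conversion concrete, here is a minimal sketch of the underlying geometry for a single coordinate. The function name pix2deg_1d and the screen parameters (1280 px across 38 cm, viewed from 68 cm) are assumptions chosen for illustration. They are not read from the dataset definition, and the sketch is not the library's implementation.

```python
import math


def pix2deg_1d(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (sketch)."""
    # Express the viewing distance in pixel units.
    distance_px = distance_cm * screen_px / screen_cm
    # Shift the origin from the screen edge to the screen center.
    centered = pixel - (screen_px - 1) / 2
    # Angle between the line of sight to the center and to the gaze point.
    return math.degrees(math.atan2(centered, distance_px))


# Assumed setup, for illustration only.
x_deg = pix2deg_1d(206.8, screen_px=1280, screen_cm=38.0, distance_cm=68.0)
print(round(x_deg, 2))  # roughly -10.7
```

With these assumed values the result lands close to the first horizontal position in the table above. Negative angles mean the gaze point lies left of (or above) the screen center.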

We would also like to have velocity data available. There are four methods to choose from:

  • preceding: this takes only the single preceding sample into account for the velocity calculation. Most noisy variant.

  • neighbors: this takes the two neighboring samples into account for the velocity calculation. A bit less noisy.

  • fivepoint: this increases the neighboring samples to two on each side. You can get a smooth conversion this way.

  • savitzky_golay: this uses the Savitzky-Golay differentiation filter for the conversion. You can specify additional parameters like window_length and degree. With well-chosen parameters this can give the best results.

Let’s use the fivepoint method first:

[6]:
dataset.pos2vel(method='fivepoint')

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 47.41it/s]
[6]:
shape: (17_223, 8)
text_id  page_id  time        stimuli_x  stimuli_y  pixel           position                 velocity
i64      i64      f64         f64        f64        list[f64]       list[f64]                list[f64]
0        1        1.988145e6  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
0        1        1.988146e6  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
0        1        1.988147e6  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
0        1        1.988148e6  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
0        1        1.988149e6  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
0        1        1.98815e6   -1.0       -1.0       [207.0, 151.3]  [-10.692768, -8.879085]  [2.012819, -0.808615]
0        1        1.988151e6  -1.0       -1.0       [207.2, 151.4]  [-10.687937, -8.876659]  [4.025683, 2.83017]
0        1        1.988152e6  -1.0       -1.0       [207.4, 151.6]  [-10.683106, -8.871807]  [4.428328, 5.256091]
0        1        1.988153e6  -1.0       -1.0       [207.6, 151.9]  [-10.678275, -8.86453]   [3.220663, 4.851847]
0        1        1.988154e6  -1.0       -1.0       [207.7, 152.1]  [-10.67586, -8.859678]   [1.610354, 3.234622]
0        1        1.988155e6  -1.0       -1.0       [207.7, 152.1]  [-10.67586, -8.859678]   [-2.9606e-13, 1.617343]
0        1        1.988156e6  -1.0       -1.0       [207.7, 152.2]  [-10.67586, -8.857252]   [-0.805187, 1.213025]
…        …        …           …          …          …               …                        …
0        1        2.005356e6  -1.0       -1.0       [370.4, 419.0]  [-6.700617, -2.297363]   [30.837758, 1.653971]
0        1        2.005357e6  -1.0       -1.0       [371.2, 419.0]  [-6.680877, -2.297363]   [7.401726, -1.240481]
0        1        2.005358e6  -1.0       -1.0       [371.1, 418.9]  [-6.683345, -2.299844]   [-14.803188, -4.961884]
0        1        2.005359e6  -1.0       -1.0       [369.9, 418.7]  [-6.712953, -2.304806]   [-34.126826, -11.164066]
0        1        2.00536e6   -1.0       -1.0       [368.1, 418.1]  [-6.75736, -2.319691]    [-48.510256, -17.366038]
0        1        2.005361e6  -1.0       -1.0       [365.9, 417.1]  [-6.811623, -2.3445]     [-56.310241, -21.086876]
0        1        2.005362e6  -1.0       -1.0       [363.3, 416.3]  [-6.875737, -2.364346]   [-61.63851, -21.913173]
0        1        2.005363e6  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
0        1        2.005364e6  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
0        1        2.005365e6  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
0        1        2.005366e6  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
0        1        2.005367e6  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]

The processed result has been added as a new column named velocity to our gaze dataframe.
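
The null entries in the first and last two rows follow from the method itself: an estimate that needs two samples on each side cannot be computed at the edges. The sketch below illustrates this with common finite-difference formulas for the three simpler methods. The function name pos2vel_sketch is hypothetical, the 1000 Hz sampling rate is an assumption, and the formulas are not taken from the library source.

```python
def pos2vel_sketch(x, sampling_rate, method="fivepoint"):
    """Finite-difference velocity for a 1-D list of positions (sketch)."""
    n = len(x)
    v = [None] * n  # samples without enough neighbors stay undefined
    for i in range(n):
        if method == "preceding" and i >= 1:
            # Difference to the single preceding sample.
            v[i] = (x[i] - x[i - 1]) * sampling_rate
        elif method == "neighbors" and 1 <= i < n - 1:
            # Central difference over one sample on each side.
            v[i] = (x[i + 1] - x[i - 1]) * sampling_rate / 2
        elif method == "fivepoint" and 2 <= i < n - 2:
            # Weighted difference over two samples on each side.
            v[i] = (x[i + 2] + x[i + 1] - x[i - 1] - x[i - 2]) * sampling_rate / 6
    return v


# First five horizontal positions from the table above; 1000 Hz is assumed.
x = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]
velocity = pos2vel_sketch(x, sampling_rate=1000)
print(velocity)  # only the middle entry is defined, at about 1.61
```

Under these assumptions the middle entry is close to the first non-null horizontal velocity shown above.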

We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:

[7]:
dataset.pos2vel(method='savitzky_golay', degree=2, window_length=7)

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 51.80it/s]
[7]:
shape: (17_223, 8)
text_id  page_id  time        stimuli_x  stimuli_y  pixel           position                 velocity
i64      i64      f64         f64        f64        list[f64]       list[f64]                list[f64]
0        1        1.988145e6  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [1.207641, -3.119165]
0        1        1.988146e6  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [1.20764, -4.072198]
0        1        1.988147e6  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.035119, -4.765267]
0        1        1.988148e6  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [1.207654, -4.245382]
0        1        1.988149e6  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [1.552735, -2.339263]
0        1        1.98815e6   -1.0       -1.0       [207.0, 151.3]  [-10.692768, -8.879085]  [2.242885, 0.000009]
0        1        1.988151e6  -1.0       -1.0       [207.2, 151.4]  [-10.687937, -8.876659]  [2.933036, 1.992718]
0        1        1.988152e6  -1.0       -1.0       [207.4, 151.6]  [-10.683106, -8.871807]  [3.364372, 3.378942]
0        1        1.988153e6  -1.0       -1.0       [207.6, 151.9]  [-10.678275, -8.86453]   [2.933062, 3.98543]
0        1        1.988154e6  -1.0       -1.0       [207.7, 152.1]  [-10.67586, -8.859678]   [1.63908, 3.292347]
0        1        1.988155e6  -1.0       -1.0       [207.7, 152.1]  [-10.67586, -8.859678]   [0.517608, 2.425984]
0        1        1.988156e6  -1.0       -1.0       [207.7, 152.2]  [-10.67586, -8.857252]   [-0.25881, 0.953079]
…        …        …           …          …          …               …                        …
0        1        2.005356e6  -1.0       -1.0       [370.4, 419.0]  [-6.700617, -2.297363]   [30.127398, 2.215118]
0        1        2.005357e6  -1.0       -1.0       [371.2, 419.0]  [-6.680877, -2.297363]   [8.10499, -1.772092]
0        1        2.005358e6  -1.0       -1.0       [371.1, 418.9]  [-6.683345, -2.299844]   [-12.862729, -6.645273]
0        1        2.005359e6  -1.0       -1.0       [369.9, 418.7]  [-6.712953, -2.304806]   [-30.745214, -11.252527]
0        1        2.00536e6   -1.0       -1.0       [368.1, 418.1]  [-6.75736, -2.319691]    [-44.219144, -15.593809]
0        1        2.005361e6  -1.0       -1.0       [365.9, 417.1]  [-6.811623, -2.3445]     [-54.515696, -19.137494]
0        1        2.005362e6  -1.0       -1.0       [363.3, 416.3]  [-6.875737, -2.364346]   [-59.347609, -20.909048]
0        1        2.005363e6  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-62.062479, -20.465552]
0        1        2.005364e6  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-61.343786, -18.073031]
0        1        2.005365e6  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-53.501231, -14.617634]
0        1        2.005366e6  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [-41.879965, -10.276475]
0        1        2.005367e6  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [-27.710881, -6.112645]

This has overwritten our velocity column. As we can see, the values differ slightly from the fivepoint result, and the first and last samples are no longer null.
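
For a polynomial of degree 1 or 2, the Savitzky-Golay first-derivative estimate at the center of a symmetric window reduces to a weighted sum in which each sample is weighted by its offset from the center. The sketch below hand-rolls only that special case. The function name savgol_velocity_sketch is hypothetical and the 1000 Hz sampling rate is an assumption. Unlike the output above, the sketch leaves the edge samples undefined, so the library evidently treats the boundaries differently.

```python
def savgol_velocity_sketch(x, sampling_rate, window_length=7):
    """Savitzky-Golay first derivative for polynomial degree 1 or 2 (sketch).

    window_length is expected to be odd.
    """
    half = window_length // 2
    offsets = range(-half, half + 1)
    norm = sum(k * k for k in offsets)  # 28 for a window of 7
    v = [None] * len(x)  # edges are left undefined in this sketch
    for i in range(half, len(x) - half):
        # Slope of the least-squares polynomial at the window center.
        v[i] = sum(k * x[i + k] for k in offsets) / norm * sampling_rate
    return v


# Horizontal positions of rows 3 to 9 from the table above.
x = [-10.692768, -10.690352, -10.692768, -10.692768,
     -10.687937, -10.683106, -10.678275]
center_velocity = savgol_velocity_sketch(x, sampling_rate=1000)[3]
print(round(center_velocity, 2))  # about 2.24
```

Under these assumptions the center value is close to the horizontal velocity in the sixth row above. A wider window averages over more samples and smooths more strongly, at the cost of blurring fast changes such as saccade onsets.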

What you have learned in this tutorial:

  • transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()

  • transforming positional data into velocity data by using Dataset.pos2vel()

  • passing additional keyword arguments when using the Savitzky-Golay differentiation filter