Preprocessing Raw Gaze Data#

What you will learn in this tutorial:#

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

Preparations#

We import pymovements as the alias pm for convenience.

[1]:
import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
100%|██████████| 20/20 [00:00<00:00, 20.53it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7f4e8cabee80>

We can verify that all files have been loaded by checking the fileinfo attribute:

[3]:
dataset.fileinfo
[3]:
shape: (20, 3)
text_id  page_id  filepath
i64      i64      str
0        1        "aeye-lab-pymov…
0        2        "aeye-lab-pymov…
0        3        "aeye-lab-pymov…
0        4        "aeye-lab-pymov…
0        5        "aeye-lab-pymov…
1        1        "aeye-lab-pymov…
1        2        "aeye-lab-pymov…
1        3        "aeye-lab-pymov…
1        4        "aeye-lab-pymov…
1        5        "aeye-lab-pymov…
2        1        "aeye-lab-pymov…
2        2        "aeye-lab-pymov…
2        3        "aeye-lab-pymov…
2        4        "aeye-lab-pymov…
2        5        "aeye-lab-pymov…
3        1        "aeye-lab-pymov…
3        2        "aeye-lab-pymov…
3        3        "aeye-lab-pymov…
3        4        "aeye-lab-pymov…
3        5        "aeye-lab-pymov…

Now let’s inspect our gaze dataframe:

[4]:
dataset.gaze[0].frame.head()
[4]:
shape: (5, 6)
time        stimuli_x  stimuli_y  text_id  page_id  pixel
f32         f32        f32        i64      i64      list[f32]
1.988145e6  -1.0       -1.0       0        1        [206.800003, 152.399994]
1.988146e6  -1.0       -1.0       0        1        [206.899994, 152.100006]
1.988147e6  -1.0       -1.0       0        1        [207.0, 151.800003]
1.988148e6  -1.0       -1.0       0        1        [207.100006, 151.699997]
1.988149e6  -1.0       -1.0       0        1        [207.0, 151.5]

Apart from some trial identifier columns, we see the columns time and pixel. The pixel column holds the x and y coordinates of each sample as a list.

Preprocessing#

We now want to transform these pixel position coordinates into coordinates in degrees of visual angle. This is simply done by:

[5]:
dataset.pix2deg()

dataset.gaze[0].frame
100%|██████████| 20/20 [00:01<00:00, 10.48it/s]
[5]:
shape: (17_223, 7)
time        stimuli_x  stimuli_y  text_id  page_id  pixel                     position
f32         f32        f32        i64      i64      list[f32]                 list[f32]
1.988145e6  -1.0       -1.0       0        1        [206.800003, 152.399994]  [-10.697598, -8.8524]
1.988146e6  -1.0       -1.0       0        1        [206.899994, 152.100006]  [-10.695184, -8.859678]
1.988147e6  -1.0       -1.0       0        1        [207.0, 151.800003]       [-10.692768, -8.866957]
1.988148e6  -1.0       -1.0       0        1        [207.100006, 151.699997]  [-10.690351, -8.869382]
1.988149e6  -1.0       -1.0       0        1        [207.0, 151.5]            [-10.692768, -8.874233]
1.98815e6   -1.0       -1.0       0        1        [207.0, 151.300003]       [-10.692768, -8.879086]
1.988151e6  -1.0       -1.0       0        1        [207.199997, 151.399994]  [-10.687937, -8.87666]
1.988152e6  -1.0       -1.0       0        1        [207.399994, 151.600006]  [-10.683106, -8.871807]
1.988153e6  -1.0       -1.0       0        1        [207.600006, 151.899994]  [-10.678275, -8.864531]
1.988154e6  -1.0       -1.0       0        1        [207.699997, 152.100006]  [-10.675859, -8.859678]
1.988155e6  -1.0       -1.0       0        1        [207.699997, 152.100006]  [-10.675859, -8.859678]
1.988156e6  -1.0       -1.0       0        1        [207.699997, 152.199997]  [-10.675859, -8.857252]
…           …          …          …        …        …                         …
2.005356e6  -1.0       -1.0       0        1        [370.399994, 419.0]       [-6.700617, -2.297363]
2.005357e6  -1.0       -1.0       0        1        [371.200012, 419.0]       [-6.680877, -2.297363]
2.005358e6  -1.0       -1.0       0        1        [371.100006, 418.899994]  [-6.683344, -2.299844]
2.005359e6  -1.0       -1.0       0        1        [369.899994, 418.700012]  [-6.712954, -2.304806]
2.00536e6   -1.0       -1.0       0        1        [368.100006, 418.100006]  [-6.75736, -2.319691]
2.005361e6  -1.0       -1.0       0        1        [365.899994, 417.100006]  [-6.811623, -2.3445]
2.005362e6  -1.0       -1.0       0        1        [363.299988, 416.299988]  [-6.875737, -2.364347]
2.005363e6  -1.0       -1.0       0        1        [361.0, 415.399994]       [-6.932438, -2.386672]
2.005364e6  -1.0       -1.0       0        1        [358.0, 414.5]            [-7.006376, -2.408998]
2.005365e6  -1.0       -1.0       0        1        [355.799988, 413.799988]  [-7.060582, -2.426362]
2.005366e6  -1.0       -1.0       0        1        [353.100006, 413.200012]  [-7.12709, -2.441245]
2.005367e6  -1.0       -1.0       0        1        [351.200012, 412.899994]  [-7.173881, -2.448686]

The processed result has been added as a new column named position to our gaze dataframe.
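The conversion itself is basic trigonometry: a pixel offset from the screen centre is turned into a physical offset on the screen and then into a visual angle via the arctangent. Here is a minimal one-dimensional sketch with made-up screen geometry; the real values come from the ToyDataset experiment definition, and pymovements also handles the pixel-origin shift and both axes for you:

```python
import numpy as np

# Hypothetical screen geometry -- the true values are defined by the
# experiment setup of the dataset and will differ.
screen_width_px = 1280
screen_width_cm = 38.0
distance_cm = 68.0  # eye-to-screen distance

def pix2deg(x_px: float) -> float:
    """Convert a horizontal pixel offset (origin at the screen centre)
    into degrees of visual angle."""
    # pixel offset -> physical offset on the screen in centimeters
    x_cm = x_px * screen_width_cm / screen_width_px
    # physical offset -> visual angle in degrees
    return np.degrees(np.arctan2(x_cm, distance_cm))

# the screen centre maps to 0 degrees of visual angle
print(pix2deg(0.0))
```

The arctan2 form keeps the sign of the offset, so samples left of (or above) the screen centre come out negative, which matches the negative position values in the table above.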

Additionally, we would like to have velocity data available. There are four different methods:

  • preceding: takes only the single preceding sample into account for the velocity calculation. This is the noisiest variant.

  • neighbors: takes the two neighboring samples into account for the velocity calculation. A bit less noisy.

  • smooth: extends the window to two neighboring samples on each side, yielding a smoother conversion.

  • savitzky_golay: uses the Savitzky-Golay differentiation filter for the conversion. You can specify additional parameters like window_length and degree. With suitable parameters this usually gives the best results.
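The first three methods are plain finite-difference schemes. The following numpy sketch illustrates the underlying formulas; the sampling rate, edge handling, and exact scaling are assumptions here, so the implementation inside pymovements may differ in detail:

```python
import numpy as np

sampling_rate = 1000  # Hz -- assumed; matches the 1 ms time steps above

pos = np.array([0.0, 0.1, 0.3, 0.6, 1.0, 1.5])

# preceding: v[i] = (x[i] - x[i-1]) * fs
v_preceding = np.full_like(pos, np.nan)
v_preceding[1:] = np.diff(pos) * sampling_rate

# neighbors: v[i] = (x[i+1] - x[i-1]) / 2 * fs  (central difference)
v_neighbors = np.full_like(pos, np.nan)
v_neighbors[1:-1] = (pos[2:] - pos[:-2]) / 2 * sampling_rate

# smooth: five-point window, two samples on each side:
# v[i] = (x[i+2] + x[i+1] - x[i-1] - x[i-2]) / 6 * fs
v_smooth = np.full_like(pos, np.nan)
v_smooth[2:-2] = (pos[4:] + pos[3:-1] - pos[1:-3] - pos[:-4]) / 6 * sampling_rate
```

Widening the window averages out sample-to-sample noise, which is why smooth is less noisy than preceding. The NaN entries at the edges correspond to the null values you will see at segment boundaries in the dataframe below.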

Let’s use the fivepoint method first, which corresponds to the five-point window of the smooth variant described above:

[6]:
dataset.pos2vel(method='fivepoint')

dataset.gaze[0].frame
100%|██████████| 20/20 [00:01<00:00, 19.40it/s]
[6]:
shape: (17_223, 8)
time        stimuli_x  stimuli_y  text_id  page_id  pixel                     position                  velocity
f32         f32        f32        i64      i64      list[f32]                 list[f32]                 list[f32]
1.988145e6  -1.0       -1.0       0        1        [206.800003, 152.399994]  [-10.697598, -8.8524]     [null, null]
1.988146e6  -1.0       -1.0       0        1        [206.899994, 152.100006]  [-10.695184, -8.859678]   [null, null]
1.988147e6  -1.0       -1.0       0        1        [207.0, 151.800003]       [-10.692768, -8.866957]   [1.610438, -5.256017]
1.988148e6  -1.0       -1.0       0        1        [207.100006, 151.699997]  [-10.690351, -8.869382]   [0.40261, -4.447301]
1.988149e6  -1.0       -1.0       0        1        [207.0, 151.5]            [-10.692768, -8.874233]   [0.402451, -3.234386]
1.98815e6   -1.0       -1.0       0        1        [207.0, 151.300003]       [-10.692768, -8.879086]   [2.012571, -0.808557]
1.988151e6  -1.0       -1.0       0        1        [207.199997, 151.399994]  [-10.687937, -8.87666]    [4.025777, 2.830188]
1.988152e6  -1.0       -1.0       0        1        [207.399994, 151.600006]  [-10.683106, -8.871807]   [4.428546, 5.256176]
1.988153e6  -1.0       -1.0       0        1        [207.600006, 151.899994]  [-10.678275, -8.864531]   [3.220717, 4.851818]
1.988154e6  -1.0       -1.0       0        1        [207.699997, 152.100006]  [-10.675859, -8.859678]   [1.610438, 3.234545]
1.988155e6  -1.0       -1.0       0        1        [207.699997, 152.100006]  [-10.675859, -8.859678]   [0.000159, 1.617432]
1.988156e6  -1.0       -1.0       0        1        [207.699997, 152.199997]  [-10.675859, -8.857252]   [-0.805219, 1.213074]
…           …          …          …        …        …                         …                         …
2.005356e6  -1.0       -1.0       0        1        [370.399994, 419.0]       [-6.700617, -2.297363]    [30.837774, 1.65391]
2.005357e6  -1.0       -1.0       0        1        [371.200012, 419.0]       [-6.680877, -2.297363]    [7.401864, -1.240412]
2.005358e6  -1.0       -1.0       0        1        [371.100006, 418.899994]  [-6.683344, -2.299844]    [-14.803171, -4.961729]
2.005359e6  -1.0       -1.0       0        1        [369.899994, 418.700012]  [-6.712954, -2.304806]    [-34.126919, -11.16403]
2.00536e6   -1.0       -1.0       0        1        [368.100006, 418.100006]  [-6.75736, -2.319691]     [-48.510315, -17.366093]
2.005361e6  -1.0       -1.0       0        1        [365.899994, 417.100006]  [-6.811623, -2.3445]      [-56.310177, -21.087051]
2.005362e6  -1.0       -1.0       0        1        [363.299988, 416.299988]  [-6.875737, -2.364347]    [-61.638596, -21.913252]
2.005363e6  -1.0       -1.0       0        1        [361.0, 415.399994]       [-6.932438, -2.386672]    [-63.266281, -21.085701]
2.005364e6  -1.0       -1.0       0        1        [358.0, 414.5]            [-7.006376, -2.408998]    [-63.249668, -19.431353]
2.005365e6  -1.0       -1.0       0        1        [355.799988, 413.799988]  [-7.060582, -2.426362]    [-60.359718, -15.709997]
2.005366e6  -1.0       -1.0       0        1        [353.100006, 413.200012]  [-7.12709, -2.441245]     [null, null]
2.005367e6  -1.0       -1.0       0        1        [351.200012, 412.899994]  [-7.173881, -2.448686]    [null, null]

The processed result has been added as a new column named velocity to our gaze dataframe.

We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:

[7]:
dataset.pos2vel(method='savitzky_golay', degree=2, window_length=7)

dataset.gaze[0].frame
100%|██████████| 20/20 [00:01<00:00, 18.55it/s]
[7]:
shape: (17_223, 8)
time        stimuli_x  stimuli_y  text_id  page_id  pixel                     position                  velocity
f32         f32        f32        i64      i64      list[f32]                 list[f32]                 list[f32]
1.988145e6  -1.0       -1.0       0        1        [206.800003, 152.399994]  [-10.697598, -8.8524]     [1.207726, -3.11923]
1.988146e6  -1.0       -1.0       0        1        [206.899994, 152.100006]  [-10.695184, -8.859678]   [1.207692, -4.072189]
1.988147e6  -1.0       -1.0       0        1        [207.0, 151.800003]       [-10.692768, -8.866957]   [1.035145, -4.765272]
1.988148e6  -1.0       -1.0       0        1        [207.100006, 151.699997]  [-10.690351, -8.869382]   [1.207726, -4.245451]
1.988149e6  -1.0       -1.0       0        1        [207.0, 151.5]            [-10.692768, -8.874233]   [1.552786, -2.339193]
1.98815e6   -1.0       -1.0       0        1        [207.0, 151.300003]       [-10.692768, -8.879086]   [2.242872, 0.000034]
1.988151e6  -1.0       -1.0       0        1        [207.199997, 151.399994]  [-10.687937, -8.87666]    [2.932991, 1.992668]
1.988152e6  -1.0       -1.0       0        1        [207.399994, 151.600006]  [-10.683106, -8.871807]   [3.364461, 3.378902]
1.988153e6  -1.0       -1.0       0        1        [207.600006, 151.899994]  [-10.678275, -8.864531]   [2.933128, 3.985473]
1.988154e6  -1.0       -1.0       0        1        [207.699997, 152.100006]  [-10.675859, -8.859678]   [1.639094, 3.29239]
1.988155e6  -1.0       -1.0       0        1        [207.699997, 152.100006]  [-10.675859, -8.859678]   [0.517641, 2.425943]
1.988156e6  -1.0       -1.0       0        1        [207.699997, 152.199997]  [-10.675859, -8.857252]   [-0.25882, 0.953129]
…           …          …          …        …        …                         …                         …
2.005356e6  -1.0       -1.0       0        1        [370.399994, 419.0]       [-6.700617, -2.297363]    [30.127287, 2.215104]
2.005357e6  -1.0       -1.0       0        1        [371.200012, 419.0]       [-6.680877, -2.297363]    [8.104988, -1.772072]
2.005358e6  -1.0       -1.0       0        1        [371.100006, 418.899994]  [-6.683344, -2.299844]    [-12.8627, -6.64522]
2.005359e6  -1.0       -1.0       0        1        [369.899994, 418.700012]  [-6.712954, -2.304806]    [-30.745234, -11.25254]
2.00536e6   -1.0       -1.0       0        1        [368.100006, 418.100006]  [-6.75736, -2.319691]     [-44.219154, -15.593843]
2.005361e6  -1.0       -1.0       0        1        [365.899994, 417.100006]  [-6.811623, -2.3445]      [-54.51572, -19.137569]
2.005362e6  -1.0       -1.0       0        1        [363.299988, 416.299988]  [-6.875737, -2.364347]    [-59.347614, -20.909182]
2.005363e6  -1.0       -1.0       0        1        [361.0, 415.399994]       [-6.932438, -2.386672]    [-62.0625, -20.465605]
2.005364e6  -1.0       -1.0       0        1        [358.0, 414.5]            [-7.006376, -2.408998]    [-61.343773, -18.07303]
2.005365e6  -1.0       -1.0       0        1        [355.799988, 413.799988]  [-7.060582, -2.426362]    [-53.501213, -14.617588]
2.005366e6  -1.0       -1.0       0        1        [353.100006, 413.200012]  [-7.12709, -2.441245]     [-41.879959, -10.276445]
2.005367e6  -1.0       -1.0       0        1        [351.200012, 412.899994]  [-7.173881, -2.448686]    [-27.710863, -6.112601]

This has overwritten our velocity column. As we can see, the resulting values differ slightly from those of the fivepoint method.
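Conceptually, the Savitzky-Golay derivative fits a low-degree polynomial to every centred window of samples and evaluates that polynomial's first derivative at the window centre. The following is an illustrative numpy re-implementation, not the code pymovements actually runs; the sampling rate of 1000 Hz is an assumption:

```python
import numpy as np

def savgol_velocity(pos, window_length=7, degree=2, sampling_rate=1000):
    """Estimate velocity by fitting a degree-2 polynomial to each
    centred window and taking its first derivative at the midpoint."""
    half = window_length // 2
    t = np.arange(-half, half + 1) / sampling_rate  # window timestamps
    vel = np.full(len(pos), np.nan)
    for i in range(half, len(pos) - half):
        coeffs = np.polyfit(t, pos[i - half:i + half + 1], degree)
        # derivative of the fitted polynomial, evaluated at the
        # window centre (t = 0)
        vel[i] = np.polyder(np.poly1d(coeffs))(0.0)
    return vel

# for a perfectly linear trajectory the fit recovers the slope
pos = np.arange(20, dtype=float) * 0.002  # 2 deg/s at 1000 Hz
velocity = savgol_velocity(pos)
```

For clean linear motion the polynomial fit recovers the slope exactly, while for noisy data the least-squares fit suppresses high-frequency noise more gracefully than the plain differencing schemes, which is why this method often gives the best results with well-chosen parameters.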

What you have learned in this tutorial:#

  • transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()

  • transforming positional data into velocity data by using Dataset.pos2vel()

  • passing additional keyword arguments when using the Savitzky-Golay differentiation filter