Preprocessing Raw Gaze Data

What you will learn in this tutorial:

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

Preparations

We import pymovements under the alias pm for convenience.

[1]:
import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
100%|██████████| 20/20 [00:00<00:00, 177.66it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7fb4fdac6a60>

We can verify that all files have been loaded in by checking the fileinfo attribute:

[3]:
dataset.fileinfo
[3]:
shape: (20, 3)
text_id  page_id  filepath
i64      i64      str
0        1        "aeye-lab-pymov…
0        2        "aeye-lab-pymov…
0        3        "aeye-lab-pymov…
0        4        "aeye-lab-pymov…
0        5        "aeye-lab-pymov…
1        1        "aeye-lab-pymov…
1        2        "aeye-lab-pymov…
1        3        "aeye-lab-pymov…
1        4        "aeye-lab-pymov…
1        5        "aeye-lab-pymov…
2        1        "aeye-lab-pymov…
2        2        "aeye-lab-pymov…
2        3        "aeye-lab-pymov…
2        4        "aeye-lab-pymov…
2        5        "aeye-lab-pymov…
3        1        "aeye-lab-pymov…
3        2        "aeye-lab-pymov…
3        3        "aeye-lab-pymov…
3        4        "aeye-lab-pymov…
3        5        "aeye-lab-pymov…

Now let’s inspect our gaze dataframe:

[4]:
dataset.gaze[0].frame.head()
[4]:
shape: (5, 5)
text_id  page_id  time        x_right_pix  y_right_pix
i64      i64      f64         f64          f64
0        1        1.988145e6  206.8        152.4
0        1        1.988146e6  206.9        152.1
0        1        1.988147e6  207.0        151.8
0        1        1.988148e6  207.1        151.7
0        1        1.988149e6  207.0        151.5

Apart from the text_id and page_id labels, the dataframe contains the following columns: time, x_right_pix and y_right_pix.
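The velocity computation later in this tutorial depends on the time between samples, so it can be useful to check the step size of the time column first. The following is a minimal sketch using the five timestamps shown above. It assumes that the timestamps are in milliseconds, which is not stated in this tutorial.

```python
# Timestamps copied from the first five rows of the gaze dataframe above.
times = [1.988145e6, 1.988146e6, 1.988147e6, 1.988148e6, 1.988149e6]

# Difference between each pair of consecutive timestamps.
steps = [later - earlier for earlier, later in zip(times, times[1:])]
step = steps[0]

# All five samples should be evenly spaced.
evenly_spaced = all(s == step for s in steps)

# ASSUMPTION: timestamps are in milliseconds, so the rate in Hz is 1000 / step.
sampling_rate = 1000.0 / step

print(evenly_spaced, step, sampling_rate)
```

Under that assumption, the samples shown here are one millisecond apart.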

Preprocessing

We now want to transform these pixel coordinates into degrees of visual angle. This is done with a single call:

[5]:
dataset.pix2deg()

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 713.83it/s]
[5]:
shape: (17_223, 7)
text_id  page_id  time        x_right_pix  y_right_pix  y_right_pos  x_right_pos
i64      i64      f64         f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -12.005591   -7.528075
0        1        1.988146e6  206.9        152.1        -12.01277    -7.525633
0        1        1.988147e6  207.0        151.8        -12.019949   -7.52319
0        1        1.988148e6  207.1        151.7        -12.022342   -7.520748
0        1        1.988149e6  207.0        151.5        -12.027128   -7.52319
0        1        1.98815e6   207.0        151.3        -12.031913   -7.52319
0        1        1.988151e6  207.2        151.4        -12.02952    -7.518305
0        1        1.988152e6  207.4        151.6        -12.024735   -7.513421
0        1        1.988153e6  207.6        151.9        -12.017556   -7.508536
0        1        1.988154e6  207.7        152.1        -12.01277    -7.506093
0        1        1.988155e6  207.7        152.1        -12.01277    -7.506093
0        1        1.988156e6  207.7        152.2        -12.010377   -7.506093
…        …        …           …            …            …            …
0        1        2.005356e6  370.4        419.0        -5.498696    -3.501922
0        1        2.005357e6  371.2        419.0        -5.498696    -3.482116
0        1        2.005358e6  371.1        418.9        -5.501175    -3.484592
0        1        2.005359e6  369.9        418.7        -5.506132    -3.5143
0        1        2.00536e6   368.1        418.1        -5.521002    -3.558859
0        1        2.005361e6  365.9        417.1        -5.545783    -3.613315
0        1        2.005362e6  363.3        416.3        -5.565607    -3.677663
0        1        2.005363e6  361.0        415.4        -5.587907    -3.734578
0        1        2.005364e6  358.0        414.5        -5.610206    -3.808805
0        1        2.005365e6  355.8        413.8        -5.627548    -3.863229
0        1        2.005366e6  353.1        413.2        -5.642412    -3.930014
0        1        2.005367e6  351.2        412.9        -5.649843    -3.977003

The result has been added to our gaze dataframe as two new columns: x_right_pos and y_right_pos.
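To give an idea of what such a conversion involves, here is a minimal sketch of the usual geometry: center the pixel coordinate on the screen, convert it to a physical length, and take the angle it subtends at the eye. The function name pix_to_deg, its parameters, and the screen values are all hypothetical illustrations. They are not the pymovements API and not the ToyDataset's actual screen setup, so the numbers will not match the columns above.

```python
import math


def pix_to_deg(pix, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle.

    Hypothetical helper for illustration; not part of pymovements.
    """
    # Shift the origin to the screen center (pixel indices run 0 .. screen_px - 1).
    centered_px = pix - (screen_px - 1) / 2
    # Convert the pixel offset to centimeters on the screen surface.
    centered_cm = centered_px * screen_cm / screen_px
    # Angle subtended at the eye, converted from radians to degrees.
    return math.degrees(math.atan2(centered_cm, distance_cm))


# ASSUMPTION: made-up geometry (1280 px and 38 cm wide, viewed from 68 cm).
for x in (0.0, 639.5, 1279.0):
    print(x, round(pix_to_deg(x, 1280, 38.0, 68.0), 3))
```

In this sketch the screen center maps to 0 degrees, and positions to the left and right of it map to negative and positive angles of equal magnitude.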

We would also like to have velocity data available. Four methods are available:

  • preceding: uses only the single preceding sample to compute the velocity. This is the noisiest variant.

  • neighbors: uses the neighboring sample on each side to compute the velocity. A bit less noisy.

  • smooth: extends the neighborhood to two samples on each side, which yields a smoother velocity signal.

  • savitzky_golay: uses the Savitzky-Golay differentiation filter. You can specify additional parameters such as window_length and polyorder. With well-chosen parameters, this method can give the best results.
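The first three methods can be read as finite differences over windows of increasing size. The sketch below is one plausible reading of the descriptions above, written in plain Python. The function bodies, the None padding at the edges, and the sampling rate are assumptions made for illustration and may differ from what pymovements does internally.

```python
def preceding(x, fs):
    # Backward difference: v[n] = (x[n] - x[n-1]) * fs.
    # The first sample has no predecessor and is left as None.
    return [None] + [(x[n] - x[n - 1]) * fs for n in range(1, len(x))]


def neighbors(x, fs):
    # Central difference over one sample on each side.
    inner = [(x[n + 1] - x[n - 1]) * fs / 2 for n in range(1, len(x) - 1)]
    return [None] + inner + [None]


def smooth(x, fs):
    # Difference over two samples on each side (needs at least 5 samples).
    inner = [
        (x[n + 2] + x[n + 1] - x[n - 1] - x[n - 2]) * fs / 6
        for n in range(2, len(x) - 2)
    ]
    return [None, None] + inner + [None, None]


# ASSUMPTION: 1000 Hz sampling and a synthetic ramp moving at 10 degrees/second.
fs = 1000.0
positions = [0.01 * n for n in range(7)]

for method in (preceding, neighbors, smooth):
    print(method.__name__, method(positions, fs))
```

On a constant-velocity ramp all three methods agree. They differ on noisy data, where wider windows average out more of the sample-to-sample jitter.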

Let’s use the smooth method first:

[6]:
dataset.pos2vel(method='smooth')

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 665.00it/s]
[6]:
shape: (17_223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  y_right_pos  x_right_pos  y_right_vel  x_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -12.005591   -7.528075    -3.589697    1.221164
0        1        1.988146e6  206.9        152.1        -12.01277    -7.525633    -7.179203    2.442343
0        1        1.988147e6  207.0        151.8        -12.019949   -7.52319     -5.184827    1.628238
0        1        1.988148e6  207.1        151.7        -12.022342   -7.520748    -4.386968    0.407059
0        1        1.988149e6  207.0        151.5        -12.027128   -7.52319     -3.190445    0.407069
0        1        1.98815e6   207.0        151.3        -12.031913   -7.52319     -0.797611    2.035352
0        1        1.988151e6  207.2        151.4        -12.02952    -7.518305    2.79166      4.070736
0        1        1.988152e6  207.4        151.6        -12.024735   -7.513421    5.184593     4.477864
0        1        1.988153e6  207.6        151.9        -12.017556   -7.508536    4.785873     3.256672
0        1        1.988154e6  207.7        152.1        -12.01277    -7.506093    3.190657     1.628352
0        1        1.988155e6  207.7        152.1        -12.01277    -7.506093    1.595371     0.0
0        1        1.988156e6  207.7        152.2        -12.010377   -7.506093    1.196552     -0.814183
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -5.498696    -3.501922    1.652276     30.943909
0        1        2.005357e6  371.2        419.0        -5.498696    -3.482116    -1.239212    7.426888
0        1        2.005358e6  371.1        418.9        -5.501175    -3.484592    -4.956757    -14.853638
0        1        2.005359e6  369.9        418.7        -5.506132    -3.5143      -11.152291   -34.244438
0        1        2.00536e6   368.1        418.1        -5.521002    -3.558859    -17.347327   -48.680946
0        1        2.005361e6  365.9        417.1        -5.545783    -3.613315    -21.063531   -56.513558
0        1        2.005362e6  363.3        416.3        -5.565607    -3.677663    -21.888043   -61.868102
0        1        2.005363e6  361.0        415.4        -5.587907    -3.734578    -21.060565   -63.509382
0        1        2.005364e6  358.0        414.5        -5.610206    -3.808805    -19.40755    -63.500293
0        1        2.005365e6  355.8        413.8        -5.627548    -3.863229    -15.690342   -60.605681
0        1        2.005366e6  353.1        413.2        -5.642412    -3.930014    -11.147739   -56.887102
0        1        2.005367e6  351.2        412.9        -5.649843    -3.977003    -3.715818    -23.494968

This added two new velocity columns to our gaze dataframe: x_right_vel and y_right_vel.

We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:

[7]:
dataset.pos2vel(method='savitzky_golay', window_length=7, polyorder=2)

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 415.38it/s]
[7]:
shape: (17_223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  y_right_pos  x_right_pos  y_right_vel  x_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -12.005591   -7.528075    -8.119266    1.918969
0        1        1.988146e6  206.9        152.1        -12.01277    -7.525633    -6.80873     1.686374
0        1        1.988147e6  207.0        151.8        -12.019949   -7.52319     -5.498195    1.453779
0        1        1.988148e6  207.1        151.7        -12.022342   -7.520748    -4.187659    1.221184
0        1        1.988149e6  207.0        151.5        -12.027128   -7.52319     -2.307447    1.570121
0        1        1.98815e6   207.0        151.3        -12.031913   -7.52319     0.000012     2.267985
0        1        1.988151e6  207.2        151.4        -12.02952    -7.518305    1.965619     2.965849
0        1        1.988152e6  207.4        151.6        -12.024735   -7.513421    3.332988     3.402009
0        1        1.988153e6  207.6        151.9        -12.017556   -7.508536    3.931231     2.965868
0        1        1.988154e6  207.7        152.1        -12.01277    -7.506093    3.247586     1.657408
0        1        1.988155e6  207.7        152.1        -12.01277    -7.506093    2.393016     0.523394
0        1        1.988156e6  207.7        152.2        -12.010377   -7.506093    0.940131     -0.261702
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -5.498696    -3.501922    2.212816     30.233706
0        1        2.005357e6  371.2        419.0        -5.498696    -3.482116    -1.770247    8.133334
0        1        2.005358e6  371.1        418.9        -5.501175    -3.484592    -6.638259    -12.907492
0        1        2.005359e6  369.9        418.7        -5.506132    -3.5143      -11.240462   -30.853149
0        1        2.00536e6   368.1        418.1        -5.521002    -3.558859    -15.576756   -44.376559
0        1        2.005361e6  365.9        417.1        -5.545783    -3.613315    -19.116071   -54.71422
0        1        2.005362e6  363.3        416.3        -5.565607    -3.677663    -20.885051   -59.569318
0        1        2.005363e6  361.0        415.4        -5.587907    -3.734578    -20.441377   -62.301177
0        1        2.005364e6  358.0        414.5        -5.610206    -3.808805    -18.051086   -61.58637
0        1        2.005365e6  355.8        413.8        -5.627548    -3.863229    -15.454523   -59.751855
0        1        2.005366e6  353.1        413.2        -5.642412    -3.930014    -12.857961   -57.917341
0        1        2.005367e6  351.2        412.9        -5.649843    -3.977003    -10.261398   -56.082826

This has overwritten our velocity columns. As the output shows, the new values differ slightly from those produced by the smooth method.
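To see where the difference comes from, we can recompute one velocity value by hand. The sketch below uses the seven x_right_pos values from the rows at times 2.005356e6 to 2.005362e6 and estimates the velocity at the center row (2.005359e6). It relies on a known property of the Savitzky-Golay filter: for window_length=7 and polyorder=2, the derivative at the window center is the least-squares slope sum(i * x[i]) / sum(i**2) for i from -3 to 3. The sampling rate of 1000 Hz is an assumption inferred from the time column, and the smooth formula is one reading of "two samples on each side".

```python
# x_right_pos values copied from the rows at time 2.005356e6 .. 2.005362e6.
x_pos = [-3.501922, -3.482116, -3.484592, -3.5143, -3.558859, -3.613315, -3.677663]

# ASSUMPTION: 1000 Hz sampling (the time column advances by 1 per sample).
fs = 1000.0

# Savitzky-Golay first derivative at the window center (window 7, polyorder 2):
# a least-squares slope with weights -3 .. 3 and normalization sum(i**2) = 28.
half = len(x_pos) // 2
offsets = range(-half, half + 1)
norm = sum(i * i for i in offsets)
sg_vel = sum(i * x for i, x in zip(offsets, x_pos)) / norm * fs

# Smooth-style difference at the same center (index 3), two samples per side.
smooth_vel = (x_pos[5] + x_pos[4] - x_pos[2] - x_pos[1]) * fs / 6

print(round(sg_vel, 3), round(smooth_vel, 3))
```

Both results agree, up to the rounding of the displayed positions, with the x_right_vel values listed for that row in the two outputs above (-30.853149 for Savitzky-Golay and -34.244438 for smooth). The Savitzky-Golay estimate weights all seven samples of its window, which is why it reacts a little more gently at the onset of this fast movement.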

What you have learned in this tutorial:

  • transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()

  • transforming positional data into velocity data by using Dataset.pos2vel()

  • passing additional keyword arguments when using the Savitzky-Golay differentiation filter