Preprocessing Raw Gaze Data

What you will learn in this tutorial:

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

Preparations

We import pymovements under the alias pm for convenience.

[1]:
import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
100%|██████████| 20/20 [00:00<00:00, 112.46it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7fa315f1f7f0>

We can verify that all files have been loaded in by checking the fileinfo attribute:

[3]:
dataset.fileinfo
[3]:
shape: (20, 3)
text_id  page_id  filepath
i64      i64      str
0        1        "aeye-lab-pymov…
0        2        "aeye-lab-pymov…
0        3        "aeye-lab-pymov…
0        4        "aeye-lab-pymov…
0        5        "aeye-lab-pymov…
1        1        "aeye-lab-pymov…
1        2        "aeye-lab-pymov…
1        3        "aeye-lab-pymov…
1        4        "aeye-lab-pymov…
1        5        "aeye-lab-pymov…
2        1        "aeye-lab-pymov…
2        2        "aeye-lab-pymov…
2        3        "aeye-lab-pymov…
2        4        "aeye-lab-pymov…
2        5        "aeye-lab-pymov…
3        1        "aeye-lab-pymov…
3        2        "aeye-lab-pymov…
3        3        "aeye-lab-pymov…
3        4        "aeye-lab-pymov…
3        5        "aeye-lab-pymov…

Now let’s inspect our gaze dataframe:

[4]:
dataset.gaze[0].frame.head()
[4]:
shape: (5, 5)
text_id  page_id  time        x_right_pix  y_right_pix
i64      i64      f64         f64          f64
0        1        1.988145e6  206.8        152.4
0        1        1.988146e6  206.9        152.1
0        1        1.988147e6  207.0        151.8
0        1        1.988148e6  207.1        151.7
0        1        1.988149e6  207.0        151.5

Apart from the text_id and page_id labels, we see the following columns: time, x_right_pix and y_right_pix.

Preprocessing

We now want to transform these pixel position coordinates into coordinates in degrees of visual angle. This is simply done by:

[5]:
dataset.pix2deg()

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 501.22it/s]
[5]:
shape: (17_223, 7)
text_id  page_id  time        x_right_pix  y_right_pix  y_right_pos  x_right_pos
i64      i64      f64         f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -12.005591   -7.528075
0        1        1.988146e6  206.9        152.1        -12.01277    -7.525633
0        1        1.988147e6  207.0        151.8        -12.019949   -7.52319
0        1        1.988148e6  207.1        151.7        -12.022342   -7.520748
0        1        1.988149e6  207.0        151.5        -12.027128   -7.52319
0        1        1.98815e6   207.0        151.3        -12.031913   -7.52319
0        1        1.988151e6  207.2        151.4        -12.02952    -7.518305
0        1        1.988152e6  207.4        151.6        -12.024735   -7.513421
0        1        1.988153e6  207.6        151.9        -12.017556   -7.508536
0        1        1.988154e6  207.7        152.1        -12.01277    -7.506093
0        1        1.988155e6  207.7        152.1        -12.01277    -7.506093
0        1        1.988156e6  207.7        152.2        -12.010377   -7.506093
…        …        …           …            …            …            …
0        1        2.005356e6  370.4        419.0        -5.498696    -3.501922
0        1        2.005357e6  371.2        419.0        -5.498696    -3.482116
0        1        2.005358e6  371.1        418.9        -5.501175    -3.484592
0        1        2.005359e6  369.9        418.7        -5.506132    -3.5143
0        1        2.00536e6   368.1        418.1        -5.521002    -3.558859
0        1        2.005361e6  365.9        417.1        -5.545783    -3.613315
0        1        2.005362e6  363.3        416.3        -5.565607    -3.677663
0        1        2.005363e6  361.0        415.4        -5.587907    -3.734578
0        1        2.005364e6  358.0        414.5        -5.610206    -3.808805
0        1        2.005365e6  355.8        413.8        -5.627548    -3.863229
0        1        2.005366e6  353.1        413.2        -5.642412    -3.930014
0        1        2.005367e6  351.2        412.9        -5.649843    -3.977003

The processed result has been added as new columns to our gaze dataframe: x_right_pos, y_right_pos.
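
The geometry behind such a conversion can be sketched in a few lines of plain Python. This is only a minimal illustration and not pymovements' actual implementation: the function name pix2deg_sketch, its parameters, and the screen values in the usage line are hypothetical. It assumes a flat screen, an eye positioned in front of the screen center, and pixel coordinates counted from one screen edge.

```python
import math


def pix2deg_sketch(pix, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (illustrative only)."""
    # Shift the coordinate so that 0 lies at the screen center.
    centered_px = pix - (screen_px - 1) / 2
    # Convert pixels to centimeters on the screen surface.
    centered_cm = centered_px * screen_cm / screen_px
    # The visual angle is the angle between the line of sight to the
    # screen center and the line of sight to the gaze point.
    return math.degrees(math.atan2(centered_cm, distance_cm))


# Hypothetical setup: a screen 1280 px and 38 cm wide, viewed from 68 cm.
angle = pix2deg_sketch(206.8, screen_px=1280, screen_cm=38.0, distance_cm=68.0)
print(f"{angle:.2f} degrees")
```

Because the screen values above are made up, the printed angle is not expected to match the tutorial output. In pymovements, the experiment parameters come with the dataset definition.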

We would also like to have velocity data available. There are four methods to choose from:

  • preceding: takes only the single preceding sample into account for the velocity calculation. This is the noisiest variant.

  • neighbors: takes the neighboring sample on each side into account for the velocity calculation. A bit less noisy.

  • smooth: extends the neighborhood to two samples on each side, which gives a smoother velocity signal.

  • savitzky_golay: uses the Savitzky-Golay differentiation filter. You can specify additional parameters like window_length and polyorder. With suitable parameters, this can give the best results.
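
The first three methods can be written as simple finite differences. The sketch below is a plain-Python illustration under stated assumptions, not the library's code: the function name velocity_sketch is hypothetical, the sampling rate of 1000 Hz is an assumption based on the 1-unit time steps in the gaze frame, and border samples that lack enough neighbors are simply left as NaN (the library handles borders differently).

```python
import math


def velocity_sketch(positions, sampling_rate, method="smooth"):
    """Finite-difference velocity in position units per second (illustrative only)."""
    if method not in ("preceding", "neighbors", "smooth"):
        raise ValueError(f"unknown method: {method}")
    n = len(positions)
    velocities = [math.nan] * n  # border samples stay NaN in this sketch
    for i in range(n):
        if method == "preceding" and i >= 1:
            # Difference to the single preceding sample.
            velocities[i] = (positions[i] - positions[i - 1]) * sampling_rate
        elif method == "neighbors" and 1 <= i < n - 1:
            # Central difference over one neighbor on each side.
            velocities[i] = (positions[i + 1] - positions[i - 1]) * sampling_rate / 2
        elif method == "smooth" and 2 <= i < n - 2:
            # Weighted difference over two neighbors on each side.
            velocities[i] = (
                positions[i + 2] + positions[i + 1]
                - positions[i - 1] - positions[i - 2]
            ) * sampling_rate / 6
    return velocities


# First five x_right_pos values of the gaze frame, assumed to be sampled at 1000 Hz.
x_pos = [-7.528075, -7.525633, -7.52319, -7.520748, -7.52319]
print(velocity_sketch(x_pos, sampling_rate=1000.0, method="smooth")[2])
```

With these five samples, the smooth estimate at the middle sample comes out at roughly 1.63 degrees per second. The wider the window, the more samples are averaged, which is why smooth is less noisy than preceding.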

Let’s use the smooth method first:

[6]:
dataset.pos2vel(method='smooth')

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 448.13it/s]
[6]:
shape: (17_223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  y_right_pos  x_right_pos  x_right_vel  y_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -12.005591   -7.528075    1.221164     -3.589697
0        1        1.988146e6  206.9        152.1        -12.01277    -7.525633    2.442343     -7.179203
0        1        1.988147e6  207.0        151.8        -12.019949   -7.52319     1.628238     -5.184827
0        1        1.988148e6  207.1        151.7        -12.022342   -7.520748    0.407059     -4.386968
0        1        1.988149e6  207.0        151.5        -12.027128   -7.52319     0.407069     -3.190445
0        1        1.98815e6   207.0        151.3        -12.031913   -7.52319     2.035352     -0.797611
0        1        1.988151e6  207.2        151.4        -12.02952    -7.518305    4.070736     2.79166
0        1        1.988152e6  207.4        151.6        -12.024735   -7.513421    4.477864     5.184593
0        1        1.988153e6  207.6        151.9        -12.017556   -7.508536    3.256672     4.785873
0        1        1.988154e6  207.7        152.1        -12.01277    -7.506093    1.628352     3.190657
0        1        1.988155e6  207.7        152.1        -12.01277    -7.506093    0.0          1.595371
0        1        1.988156e6  207.7        152.2        -12.010377   -7.506093    -0.814183    1.196552
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -5.498696    -3.501922    30.943909    1.652276
0        1        2.005357e6  371.2        419.0        -5.498696    -3.482116    7.426888     -1.239212
0        1        2.005358e6  371.1        418.9        -5.501175    -3.484592    -14.853638   -4.956757
0        1        2.005359e6  369.9        418.7        -5.506132    -3.5143      -34.244438   -11.152291
0        1        2.00536e6   368.1        418.1        -5.521002    -3.558859    -48.680946   -17.347327
0        1        2.005361e6  365.9        417.1        -5.545783    -3.613315    -56.513558   -21.063531
0        1        2.005362e6  363.3        416.3        -5.565607    -3.677663    -61.868102   -21.888043
0        1        2.005363e6  361.0        415.4        -5.587907    -3.734578    -63.509382   -21.060565
0        1        2.005364e6  358.0        414.5        -5.610206    -3.808805    -63.500293   -19.40755
0        1        2.005365e6  355.8        413.8        -5.627548    -3.863229    -60.605681   -15.690342
0        1        2.005366e6  353.1        413.2        -5.642412    -3.930014    -56.887102   -11.147739
0        1        2.005367e6  351.2        412.9        -5.649843    -3.977003    -23.494968   -3.715818

This added the following new velocity columns to our gaze dataframe: x_right_vel, y_right_vel.

We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:

[7]:
dataset.pos2vel(method='savitzky_golay', window_length=7, polyorder=2)

dataset.gaze[0].frame
100%|██████████| 20/20 [00:00<00:00, 278.40it/s]
[7]:
shape: (17_223, 9)
text_id  page_id  time        x_right_pix  y_right_pix  y_right_pos  x_right_pos  x_right_vel  y_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
0        1        1.988145e6  206.8        152.4        -12.005591   -7.528075    1.918969     -8.119266
0        1        1.988146e6  206.9        152.1        -12.01277    -7.525633    1.686374     -6.80873
0        1        1.988147e6  207.0        151.8        -12.019949   -7.52319     1.453779     -5.498195
0        1        1.988148e6  207.1        151.7        -12.022342   -7.520748    1.221184     -4.187659
0        1        1.988149e6  207.0        151.5        -12.027128   -7.52319     1.570121     -2.307447
0        1        1.98815e6   207.0        151.3        -12.031913   -7.52319     2.267985     0.000012
0        1        1.988151e6  207.2        151.4        -12.02952    -7.518305    2.965849     1.965619
0        1        1.988152e6  207.4        151.6        -12.024735   -7.513421    3.402009     3.332988
0        1        1.988153e6  207.6        151.9        -12.017556   -7.508536    2.965868     3.931231
0        1        1.988154e6  207.7        152.1        -12.01277    -7.506093    1.657408     3.247586
0        1        1.988155e6  207.7        152.1        -12.01277    -7.506093    0.523394     2.393016
0        1        1.988156e6  207.7        152.2        -12.010377   -7.506093    -0.261702    0.940131
…        …        …           …            …            …            …            …            …
0        1        2.005356e6  370.4        419.0        -5.498696    -3.501922    30.233706    2.212816
0        1        2.005357e6  371.2        419.0        -5.498696    -3.482116    8.133334     -1.770247
0        1        2.005358e6  371.1        418.9        -5.501175    -3.484592    -12.907492   -6.638259
0        1        2.005359e6  369.9        418.7        -5.506132    -3.5143      -30.853149   -11.240462
0        1        2.00536e6   368.1        418.1        -5.521002    -3.558859    -44.376559   -15.576756
0        1        2.005361e6  365.9        417.1        -5.545783    -3.613315    -54.71422    -19.116071
0        1        2.005362e6  363.3        416.3        -5.565607    -3.677663    -59.569318   -20.885051
0        1        2.005363e6  361.0        415.4        -5.587907    -3.734578    -62.301177   -20.441377
0        1        2.005364e6  358.0        414.5        -5.610206    -3.808805    -61.58637    -18.051086
0        1        2.005365e6  355.8        413.8        -5.627548    -3.863229    -59.751855   -15.454523
0        1        2.005366e6  353.1        413.2        -5.642412    -3.930014    -57.917341   -12.857961
0        1        2.005367e6  351.2        412.9        -5.649843    -3.977003    -56.082826   -10.261398

This has overwritten our velocity columns. As we can see, the new values differ slightly from those produced by the smooth method.
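
To give an intuition for where these values come from, here is a minimal plain-Python sketch of a Savitzky-Golay first derivative. It is not the library's implementation: the function name savgol_velocity_sketch is hypothetical, the 1000 Hz sampling rate is an assumption, and border samples are left as NaN. It relies on the fact that, for a symmetric odd-length window and a polynomial order of 1 or 2, the filter reduces to the least-squares slope with weights k / sum(k²) for offsets k = -m, ..., m.

```python
import math


def savgol_velocity_sketch(positions, sampling_rate, window_length=7):
    """Savitzky-Golay first derivative for polyorder 1 or 2 (illustrative only)."""
    if window_length < 3 or window_length % 2 == 0:
        raise ValueError("window_length must be an odd integer >= 3")
    half = window_length // 2
    offsets = range(-half, half + 1)
    # Normalization of the least-squares slope: sum of squared offsets.
    norm = sum(k * k for k in offsets)
    n = len(positions)
    velocities = [math.nan] * n  # border samples stay NaN in this sketch
    for i in range(half, n - half):
        # Slope of the fitted polynomial at the window center, per sample.
        slope = sum(k * positions[i + k] for k in offsets) / norm
        # Scale from "per sample" to "per second".
        velocities[i] = slope * sampling_rate
    return velocities


# First seven x_right_pos values of the gaze frame, assumed to be sampled at 1000 Hz.
x_pos = [-7.528075, -7.525633, -7.52319, -7.520748, -7.52319, -7.52319, -7.518305]
print(savgol_velocity_sketch(x_pos, sampling_rate=1000.0, window_length=7)[3])
```

For the center sample of these seven values, the sketch yields roughly 1.22 degrees per second. Larger windows smooth more strongly, while higher polynomial orders follow fast changes such as saccades more closely.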

What you have learned in this tutorial:

  • transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()

  • transforming positional data into velocity data by using Dataset.pos2vel()

  • passing additional keyword arguments when using the Savitzky-Golay differentiation filter