Preprocessing Raw Gaze Data

What you will learn in this tutorial:

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

Preparations

We import pymovements under the alias pm for convenience.

[1]:
import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

[2]:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
100%|██████████| 20/20 [00:00<00:00, 38.31it/s]
[2]:
<pymovements.dataset.dataset.Dataset at 0x7f9d7cc30430>

We can verify that all files have been loaded in by checking the fileinfo attribute:

[3]:
dataset.fileinfo
[3]:
{'gaze': shape: (20, 3)
 ┌─────────┬─────────┬─────────────────────────────────┐
 │ text_id ┆ page_id ┆ filepath                        │
 │ ---     ┆ ---     ┆ ---                             │
 │ i64     ┆ i64     ┆ str                             │
 ╞═════════╪═════════╪═════════════════════════════════╡
 │ 0       ┆ 1       ┆ aeye-lab-pymovements-toy-datas… │
 │ 0       ┆ 2       ┆ aeye-lab-pymovements-toy-datas… │
 │ 0       ┆ 3       ┆ aeye-lab-pymovements-toy-datas… │
 │ 0       ┆ 4       ┆ aeye-lab-pymovements-toy-datas… │
 │ 0       ┆ 5       ┆ aeye-lab-pymovements-toy-datas… │
 │ …       ┆ …       ┆ …                               │
 │ 3       ┆ 1       ┆ aeye-lab-pymovements-toy-datas… │
 │ 3       ┆ 2       ┆ aeye-lab-pymovements-toy-datas… │
 │ 3       ┆ 3       ┆ aeye-lab-pymovements-toy-datas… │
 │ 3       ┆ 4       ┆ aeye-lab-pymovements-toy-datas… │
 │ 3       ┆ 5       ┆ aeye-lab-pymovements-toy-datas… │
 └─────────┴─────────┴─────────────────────────────────┘}

Now let’s inspect our gaze dataframe:

[4]:
dataset.gaze[0]
[4]:
Experiment(sampling_rate=1000, screen=Screen(width_px=1280, height_px=1024, width_cm=38, height_cm=30.20, distance_cm=68, origin=upper left), eyetracker=None)
shape: (17_223, 6)
┌─────────┬───────────┬───────────┬─────────┬─────────┬────────────────┐
│ time    ┆ stimuli_x ┆ stimuli_y ┆ text_id ┆ page_id ┆ pixel          │
│ ---     ┆ ---       ┆ ---       ┆ ---     ┆ ---     ┆ ---            │
│ i64     ┆ f64       ┆ f64       ┆ i64     ┆ i64     ┆ list[f64]      │
╞═════════╪═══════════╪═══════════╪═════════╪═════════╪════════════════╡
│ 1988145 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.8, 152.4] │
│ 1988146 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.9, 152.1] │
│ 1988147 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0, 151.8] │
│ 1988148 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.1, 151.7] │
│ 1988149 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0, 151.5] │
│ …       ┆ …         ┆ …         ┆ …       ┆ …       ┆ …              │
│ 2005363 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [361.0, 415.4] │
│ 2005364 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [358.0, 414.5] │
│ 2005365 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [355.8, 413.8] │
│ 2005366 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [353.1, 413.2] │
│ 2005367 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [351.2, 412.9] │
└─────────┴───────────┴───────────┴─────────┴─────────┴────────────────┘

Apart from some trial identifier columns, we see the columns time and pixel. The pixel column holds the x and y gaze coordinates of each sample as a list.

Preprocessing

We now want to transform these pixel position coordinates into coordinates in degrees of visual angle. This is simply done by:

[5]:
dataset.pix2deg()

dataset.gaze[0]
100%|██████████| 20/20 [00:01<00:00, 17.96it/s]
[5]:
Experiment(sampling_rate=1000, screen=Screen(width_px=1280, height_px=1024, width_cm=38, height_cm=30.20, distance_cm=68, origin=upper left), eyetracker=None)
shape: (17_223, 7)
┌─────────┬───────────┬───────────┬─────────┬─────────┬────────────────┬─────────────────────────┐
│ time    ┆ stimuli_x ┆ stimuli_y ┆ text_id ┆ page_id ┆ pixel          ┆ position                │
│ ---     ┆ ---       ┆ ---       ┆ ---     ┆ ---     ┆ ---            ┆ ---                     │
│ i64     ┆ f64       ┆ f64       ┆ i64     ┆ i64     ┆ list[f64]      ┆ list[f64]               │
╞═════════╪═══════════╪═══════════╪═════════╪═════════╪════════════════╪═════════════════════════╡
│ 1988145 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.8, 152.4] ┆ [-10.697598, -8.852399] │
│ 1988146 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.9, 152.1] ┆ [-10.695183, -8.859678] │
│ 1988147 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0, 151.8] ┆ [-10.692768, -8.866956] │
│ 1988148 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.1, 151.7] ┆ [-10.690352, -8.869381] │
│ 1988149 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0, 151.5] ┆ [-10.692768, -8.874233] │
│ …       ┆ …         ┆ …         ┆ …       ┆ …       ┆ …              ┆ …                       │
│ 2005363 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [361.0, 415.4] ┆ [-6.932438, -2.386672]  │
│ 2005364 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [358.0, 414.5] ┆ [-7.006376, -2.408998]  │
│ 2005365 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [355.8, 413.8] ┆ [-7.060582, -2.426362]  │
│ 2005366 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [353.1, 413.2] ┆ [-7.12709, -2.441245]   │
│ 2005367 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [351.2, 412.9] ┆ [-7.173881, -2.448686]  │
└─────────┴───────────┴───────────┴─────────┴─────────┴────────────────┴─────────────────────────┘

The processed result has been added as a new column named position to our gaze dataframe.
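To make the conversion less of a black box, here is a minimal sketch of one way to compute degrees of visual angle from a pixel coordinate. The helper name pix2deg_sketch is hypothetical, and the formula (centering on the middle of the screen and taking the arctangent against the viewing distance in pixels) is an assumption about how such a conversion can be done, not a statement of the library's implementation. The screen values are taken from the Experiment output above.

```python
import math


def pix2deg_sketch(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (sketch)."""
    # Express the viewing distance in pixels along this axis.
    distance_px = distance_cm * screen_px / screen_cm
    # Move the origin from the upper-left corner to the screen center
    # (assumption: the center lies at (screen_px - 1) / 2).
    centered = pixel - (screen_px - 1) / 2
    # Angle between the line of sight to the center and to the pixel.
    return math.degrees(math.atan2(centered, distance_px))


# First sample of the pixel column: [206.8, 152.4]
x_deg = pix2deg_sketch(206.8, screen_px=1280, screen_cm=38, distance_cm=68)
y_deg = pix2deg_sketch(152.4, screen_px=1024, screen_cm=30.2, distance_cm=68)
print(round(x_deg, 4), round(y_deg, 4))
```

Under these assumptions the result should land close to the first entry of the position column shown above. Negative values mean the gaze is left of and above the screen center.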

We would also like to have velocity data available. There are four different methods to choose from:

  • preceding: this takes only the single preceding sample into account for the velocity calculation. This is the noisiest variant.

  • neighbors: this takes the neighboring samples on both sides into account for the velocity calculation. A bit less noisy.

  • fivepoint: this increases the neighboring samples to two on each side. You can get a smoother conversion this way.

  • savitzky_golay: this uses the Savitzky-Golay differentiation filter for the conversion. You can specify additional parameters like window_length and degree. Depending on your parameters, this can lead to the best results.
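The first three methods can be sketched as plain finite differences. The function names and formulas below are assumptions based on the common textbook definitions, not code taken from the library. Samples without enough neighbors are returned as None. The position values are the horizontal components of the first five rows of the position column shown earlier.

```python
def preceding(x, fs):
    """Backward difference: uses only the preceding sample."""
    return [None] + [(x[i] - x[i - 1]) * fs for i in range(1, len(x))]


def neighbors(x, fs):
    """Central difference: uses one neighbor on each side."""
    inner = [(x[i + 1] - x[i - 1]) / 2 * fs for i in range(1, len(x) - 1)]
    return [None] + inner + [None]


def fivepoint(x, fs):
    """Five-point difference: uses two neighbors on each side."""
    inner = [
        (x[i + 2] + x[i + 1] - x[i - 1] - x[i - 2]) / 6 * fs
        for i in range(2, len(x) - 2)
    ]
    return [None, None] + inner + [None, None]


sampling_rate = 1000  # Hz, from the Experiment definition
x_positions = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]

print(preceding(x_positions, sampling_rate))
print(neighbors(x_positions, sampling_rate))
print(fivepoint(x_positions, sampling_rate))
```

The wider the stencil, the more samples are averaged and the more values at the edges of the recording remain undefined. With only five samples, the five-point variant yields a single defined value, at the third sample.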

Let’s use the fivepoint method first:

[6]:
dataset.pos2vel(method='fivepoint')

dataset.gaze[0]
100%|██████████| 20/20 [00:00<00:00, 36.66it/s]
[6]:
Experiment(sampling_rate=1000, screen=Screen(width_px=1280, height_px=1024, width_cm=38, height_cm=30.20, distance_cm=68, origin=upper left), eyetracker=None)
shape: (17_223, 8)
┌─────────┬───────────┬───────────┬─────────┬─────────┬───────────┬────────────────┬───────────────┐
│ time    ┆ stimuli_x ┆ stimuli_y ┆ text_id ┆ page_id ┆ pixel     ┆ position       ┆ velocity      │
│ ---     ┆ ---       ┆ ---       ┆ ---     ┆ ---     ┆ ---       ┆ ---            ┆ ---           │
│ i64     ┆ f64       ┆ f64       ┆ i64     ┆ i64     ┆ list[f64] ┆ list[f64]      ┆ list[f64]     │
╞═════════╪═══════════╪═══════════╪═════════╪═════════╪═══════════╪════════════════╪═══════════════╡
│ 1988145 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.8,   ┆ [-10.697598,   ┆ [null, null]  │
│         ┆           ┆           ┆         ┆         ┆ 152.4]    ┆ -8.852399]     ┆               │
│ 1988146 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.9,   ┆ [-10.695183,   ┆ [null, null]  │
│         ┆           ┆           ┆         ┆         ┆ 152.1]    ┆ -8.859678]     ┆               │
│ 1988147 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0,   ┆ [-10.692768,   ┆ [1.610194,    │
│         ┆           ┆           ┆         ┆         ┆ 151.8]    ┆ -8.866956]     ┆ -5.256267]    │
│ 1988148 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.1,   ┆ [-10.690352,   ┆ [0.402548,    │
│         ┆           ┆           ┆         ┆         ┆ 151.7]    ┆ -8.869381]     ┆ -4.447465]    │
│ 1988149 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0,   ┆ [-10.692768,   ┆ [0.402561,    │
│         ┆           ┆           ┆         ┆         ┆ 151.5]    ┆ -8.874233]     ┆ -3.234462]    │
│ …       ┆ …         ┆ …         ┆ …       ┆ …       ┆ …         ┆ …              ┆ …             │
│ 2005363 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [361.0,   ┆ [-6.932438,    ┆ [-63.266374,  │
│         ┆           ┆           ┆         ┆         ┆ 415.4]    ┆ -2.386672]     ┆ -21.085616]   │
│ 2005364 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [358.0,   ┆ [-7.006376,    ┆ [-63.249652,  │
│         ┆           ┆           ┆         ┆         ┆ 414.5]    ┆ -2.408998]     ┆ -19.431326]   │
│ 2005365 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [355.8,   ┆ [-7.060582,    ┆ [-60.359624,  │
│         ┆           ┆           ┆         ┆         ┆ 413.8]    ┆ -2.426362]     ┆ -15.710061]   │
│ 2005366 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [353.1,   ┆ [-7.12709,     ┆ [null, null]  │
│         ┆           ┆           ┆         ┆         ┆ 413.2]    ┆ -2.441245]     ┆               │
│ 2005367 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [351.2,   ┆ [-7.173881,    ┆ [null, null]  │
│         ┆           ┆           ┆         ┆         ┆ 412.9]    ┆ -2.448686]     ┆               │
└─────────┴───────────┴───────────┴─────────┴─────────┴───────────┴────────────────┴───────────────┘

The processed result has been added as a new column named velocity to our gaze dataframe.

We can also use the Savitzky-Golay differentiation filter with some additional parameters like this:

[7]:
dataset.pos2vel(method='savitzky_golay', degree=2, window_length=7)

dataset.gaze[0]
100%|██████████| 20/20 [00:00<00:00, 34.55it/s]
[7]:
Experiment(sampling_rate=1000, screen=Screen(width_px=1280, height_px=1024, width_cm=38, height_cm=30.20, distance_cm=68, origin=upper left), eyetracker=None)
shape: (17_223, 8)
┌─────────┬───────────┬───────────┬─────────┬─────────┬───────────┬────────────────┬───────────────┐
│ time    ┆ stimuli_x ┆ stimuli_y ┆ text_id ┆ page_id ┆ pixel     ┆ position       ┆ velocity      │
│ ---     ┆ ---       ┆ ---       ┆ ---     ┆ ---     ┆ ---       ┆ ---            ┆ ---           │
│ i64     ┆ f64       ┆ f64       ┆ i64     ┆ i64     ┆ list[f64] ┆ list[f64]      ┆ list[f64]     │
╞═════════╪═══════════╪═══════════╪═════════╪═════════╪═══════════╪════════════════╪═══════════════╡
│ 1988145 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.8,   ┆ [-10.697598,   ┆ [1.207641,    │
│         ┆           ┆           ┆         ┆         ┆ 152.4]    ┆ -8.852399]     ┆ -3.119165]    │
│ 1988146 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [206.9,   ┆ [-10.695183,   ┆ [1.20764,     │
│         ┆           ┆           ┆         ┆         ┆ 152.1]    ┆ -8.859678]     ┆ -4.072198]    │
│ 1988147 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0,   ┆ [-10.692768,   ┆ [1.035119,    │
│         ┆           ┆           ┆         ┆         ┆ 151.8]    ┆ -8.866956]     ┆ -4.765267]    │
│ 1988148 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.1,   ┆ [-10.690352,   ┆ [1.207654,    │
│         ┆           ┆           ┆         ┆         ┆ 151.7]    ┆ -8.869381]     ┆ -4.245382]    │
│ 1988149 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [207.0,   ┆ [-10.692768,   ┆ [1.552735,    │
│         ┆           ┆           ┆         ┆         ┆ 151.5]    ┆ -8.874233]     ┆ -2.339263]    │
│ …       ┆ …         ┆ …         ┆ …       ┆ …       ┆ …         ┆ …              ┆ …             │
│ 2005363 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [361.0,   ┆ [-6.932438,    ┆ [-62.062479,  │
│         ┆           ┆           ┆         ┆         ┆ 415.4]    ┆ -2.386672]     ┆ -20.465552]   │
│ 2005364 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [358.0,   ┆ [-7.006376,    ┆ [-61.343786,  │
│         ┆           ┆           ┆         ┆         ┆ 414.5]    ┆ -2.408998]     ┆ -18.073031]   │
│ 2005365 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [355.8,   ┆ [-7.060582,    ┆ [-53.501231,  │
│         ┆           ┆           ┆         ┆         ┆ 413.8]    ┆ -2.426362]     ┆ -14.617634]   │
│ 2005366 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [353.1,   ┆ [-7.12709,     ┆ [-41.879965,  │
│         ┆           ┆           ┆         ┆         ┆ 413.2]    ┆ -2.441245]     ┆ -10.276475]   │
│ 2005367 ┆ -1.0      ┆ -1.0      ┆ 0       ┆ 1       ┆ [351.2,   ┆ [-7.173881,    ┆ [-27.710881,  │
│         ┆           ┆           ┆         ┆         ┆ 412.9]    ┆ -2.448686]     ┆ -6.112645]    │
└─────────┴───────────┴───────────┴─────────┴─────────┴───────────┴────────────────┴───────────────┘

This has overwritten our velocity column. As we can see, the values differ slightly from the fivepoint result, and the first and last rows now contain values instead of null.
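For intuition, the Savitzky-Golay velocity can be sketched without any dependencies for the special case of a polynomial of degree 1 or 2. For a symmetric window, the least-squares slope then reduces to a weighted sum of the samples with weights proportional to their offset from the window center. The function name savgol_velocity_sketch is hypothetical, and padding the edges by repeating the first and last sample is an assumption about the edge handling.

```python
def savgol_velocity_sketch(x, window_length, sampling_rate):
    """Savitzky-Golay first derivative for polynomial degree 1 or 2 (sketch)."""
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd")
    half = window_length // 2
    offsets = range(-half, half + 1)
    # Assumption: pad both ends by repeating the edge sample.
    padded = [x[0]] * half + list(x) + [x[-1]] * half
    # Normalization of the least-squares slope for a symmetric window.
    norm = sum(k * k for k in offsets)
    velocities = []
    for i in range(len(x)):
        window = padded[i:i + window_length]
        slope = sum(k * v for k, v in zip(offsets, window)) / norm
        # Convert from degrees per sample to degrees per second.
        velocities.append(slope * sampling_rate)
    return velocities


# Horizontal position of the last five samples shown above.
x_tail = [-6.932438, -7.006376, -7.060582, -7.12709, -7.173881]
v_tail = savgol_velocity_sketch(x_tail, window_length=7, sampling_rate=1000)
print([round(v, 3) for v in v_tail[-2:]])
```

Only the last two values are printed, because the earlier ones would need samples that lie in the elided part of the table. Under the stated assumptions they should be close to the horizontal velocities in the last two rows above.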

What you have learned in this tutorial:

  • transforming pixel coordinates into degrees of visual angle by using Dataset.pix2deg()

  • transforming positional data into velocity data by using Dataset.pos2vel()

  • passing additional keyword arguments when using the Savitzky-Golay differentiation filter