pymovements in 10 minutes

What you will learn in this tutorial:

  • how to download one of the publicly available datasets

  • how to load a subset of the data into your memory

  • how to transform pixel coordinates into degrees of visual angle

  • how to transform positional data into velocity data

  • how to detect fixations by using the I-VT algorithm

  • how to detect saccades by using the microsaccades algorithm

  • how to compute additional event properties for your analysis

  • how to save your preprocessed data

  • how to plot the saccadic main sequence from your data

Downloading one of the public datasets

We import pymovements under the alias pm for convenience. We also import polars as pl, which we use later to filter event dataframes.

[1]:
import polars as pl

import pymovements as pm

pymovements provides a library of publicly available datasets.

You can browse through the available dataset definitions here: Datasets

For this tutorial we will limit ourselves to the ToyDataset due to its minimal space requirements.

Other datasets can be downloaded by simply replacing ToyDataset with one of the other available datasets.

We can initialize and download the dataset using these simple lines of code:

[2]:
dataset = pm.datasets.ToyDataset(root='data/')
dataset.download()
dataset.extract()
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip

Loading your data into memory

Next we load our dataset into memory to be able to work with it:

[3]:
dataset.load()
100%|██████████| 20/20 [00:00<00:00, 195.67it/s]

Loading fills two attributes with data. The first is the fileinfo attribute, which holds basic information about each file:

[4]:
dataset.fileinfo.head()
[4]:
shape: (5, 3)
text_id  page_id  filepath
i64      i64      str
0        1        "aeye-lab-pymov...
0        2        "aeye-lab-pymov...
0        3        "aeye-lab-pymov...
0        4        "aeye-lab-pymov...
0        5        "aeye-lab-pymov...

We notice that each filepath has an associated text_id and page_id.

We have also loaded our gaze data into the dataframes in the gaze attribute:

[5]:
dataset.gaze[0].frame.head()
[5]:
shape: (5, 5)
text_id  page_id  time        x_right_pix  y_right_pix
i64      i64      f64         f64          f64
0        1        1.988145e6  206.8        152.4
0        1        1.988146e6  206.9        152.1
0        1        1.988147e6  207.0        151.8
0        1        1.988148e6  207.1        151.7
0        1        1.988149e6  207.0        151.5

Apart from the familiar columns from the fileinfo dataframe we see the columns time, x_right_pix and y_right_pix.

The last two columns refer to the pixel coordinates at the timestep specified by time.

We are also able to just take a subset of the data by specifying values of the fileinfo columns:

[6]:
subset = {
    'text_id': [1, 2],
    'page_id': 1,
}
dataset.load(subset=subset)

dataset.fileinfo
100%|██████████| 2/2 [00:00<00:00, 200.64it/s]
[6]:
shape: (2, 3)
text_id  page_id  filepath
i64      i64      str
1        1        "aeye-lab-pymov...
2        1        "aeye-lab-pymov...

We have now loaded only a small subset of our data: the files with text_id 1 or 2 and page_id 1.
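
As a rough mental model of what subset selection does, the sketch below filters a list of file records in plain Python. It does not use pymovements: the select_subset helper and the file names are hypothetical and only mirror the column names shown above.

```python
# Hypothetical stand-in for the fileinfo table: one dict per file.
fileinfo = [
    {"text_id": t, "page_id": p, "filepath": f"text_{t}_page_{p}.csv"}
    for t in range(4)
    for p in range(1, 6)
]


def select_subset(rows, subset):
    """Keep rows that match every column given in `subset`.

    A scalar value must match exactly; a list means "any of these values".
    """
    def matches(row):
        for column, wanted in subset.items():
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row[column] not in allowed:
                return False
        return True

    return [row for row in rows if matches(row)]


selected = select_subset(fileinfo, {"text_id": [1, 2], "page_id": 1})
print([(row["text_id"], row["page_id"]) for row in selected])
# [(1, 1), (2, 1)]
```

Conditions on different columns are combined with a logical AND, while the values inside one list act as a logical OR.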

Preprocessing raw gaze data

We now want to preprocess our gaze data by transforming pixel coordinates into degrees of visual angle and then computing velocity data from our positional data.

[7]:
dataset.pix2deg()

dataset.gaze[0].frame.head()
100%|██████████| 2/2 [00:00<00:00, 568.95it/s]
[7]:
shape: (5, 7)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos
i64      i64      f64         f64          f64          f64          f64
1        1        2.415266e6  176.8        140.2        -11.420403   -9.148145
1        1        2.415267e6  176.7        139.8        -11.422806   -9.157834
1        1        2.415268e6  176.7        139.3        -11.422806   -9.169943
1        1        2.415269e6  176.6        139.3        -11.42521    -9.169943
1        1        2.41527e6   176.7        139.3        -11.422806   -9.169943

We notice that two new columns have appeared: x_right_pos and y_right_pos. These are the positional columns specified in degrees of visual angle (dva).
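
To make the conversion concrete, here is a minimal sketch of a pixel-to-dva conversion for a single coordinate, written in plain Python rather than taken from pymovements. The screen width in pixels, the physical width and the viewing distance are assumed values chosen for illustration, and centering on (width - 1) / 2 is likewise an assumption. With these values, x = 176.8 px maps to roughly -11.42 dva.

```python
import math

# Assumed screen geometry, for illustration only.
SCREEN_WIDTH_PX = 1280
SCREEN_WIDTH_CM = 38.0
DISTANCE_CM = 68.0


def pix2deg_1d(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle."""
    # Shift the origin to the screen center.
    centered_px = pixel - (screen_px - 1) / 2
    # Convert pixels to centimeters on the screen surface.
    centered_cm = centered_px * screen_cm / screen_px
    # Angle between the line of sight to the center and to the point.
    return math.degrees(math.atan2(centered_cm, distance_cm))


x_dva = pix2deg_1d(176.8, SCREEN_WIDTH_PX, SCREEN_WIDTH_CM, DISTANCE_CM)
print(round(x_dva, 2))
```

A negative value means the gaze point lies to the left of the screen center; the same function applies to the vertical axis when given the screen height in pixels and centimeters.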

For transforming our positional data into velocity data we will use the Savitzky-Golay differentiation filter.

We can also specify some additional parameters for this method:

[8]:
dataset.pos2vel(method='savitzky_golay', window_length=7, polyorder=2)

dataset.gaze[0].frame.head()
100%|██████████| 2/2 [00:00<00:00, 326.21it/s]
[8]:
shape: (5, 9)
text_id  page_id  time        x_right_pix  y_right_pix  x_right_pos  y_right_pos  y_right_vel  x_right_vel
i64      i64      f64         f64          f64          f64          f64          f64          f64
1        1        2.415266e6  176.8        140.2        -11.420403   -9.148145    -13.666945   -5.235971
1        1        2.415267e6  176.7        139.8        -11.422806   -9.157834    -9.630308    -3.004237
1        1        2.415268e6  176.7        139.3        -11.422806   -9.169943    -5.59367     -0.772503
1        1        2.415269e6  176.6        139.3        -11.42521    -9.169943    -1.557032    1.459231
1        1        2.41527e6   176.7        139.3        -11.422806   -9.169943    1.556983     4.034446
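
The idea behind the Savitzky-Golay differentiation filter can be sketched without any library: fit a low-order polynomial to each window of samples by least squares and take its slope at the window center. The helper below is a hypothetical illustration, not the pymovements implementation. It assumes a sampling rate of 1000 Hz, covers only polynomial orders 1 and 2 (where the slope has a simple closed form), and leaves edge samples undefined, whereas a real implementation typically pads or extrapolates at the edges.

```python
def savgol_velocity(positions, sampling_rate, window_length=7):
    """First derivative from a local linear or quadratic least-squares fit.

    For a symmetric window and polynomial order 1 or 2, the fitted slope at
    the window center reduces to sum(k * x[i + k]) / sum(k ** 2).
    Samples without a full window around them are returned as None.
    """
    if window_length < 3 or window_length % 2 == 0:
        raise ValueError("window_length must be an odd integer >= 3")
    half = window_length // 2
    offsets = range(-half, half + 1)
    norm = sum(k * k for k in offsets)

    velocities = [None] * len(positions)
    for i in range(half, len(positions) - half):
        slope_per_sample = sum(k * positions[i + k] for k in offsets) / norm
        # Convert from "per sample" to "per second".
        velocities[i] = slope_per_sample * sampling_rate
    return velocities


# A gaze trace moving at a constant 10 dva/s, sampled at 1000 Hz.
trace = [0.01 * i for i in range(20)]
vel = savgol_velocity(trace, sampling_rate=1000)
print(vel[3])
```

A longer window smooths more noise but also blurs short, fast movements, which is why window_length is worth tuning for your sampling rate.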

Detecting events

Now let’s detect some events.

First we will detect fixations using the I-VT algorithm with its default parameters:

[9]:
dataset.detect_events(method=pm.events.ivt)

dataset.events[0].frame.head()
2it [00:00, 202.64it/s]
[9]:
shape: (5, 6)
text_id  page_id  name        onset    offset   duration
i64      i64      str         i64      i64      i64
1        1        "fixation"  2415326  2415524  198
1        1        "fixation"  2415555  2415661  106
1        1        "fixation"  2415712  2415844  132
1        1        "fixation"  2415877  2416024  147
1        1        "fixation"  2416060  2416212  152
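
For intuition, the core of a velocity-threshold (I-VT) fixation detector fits in a few lines of plain Python: samples whose speed stays below a threshold are grouped into runs, and runs that are too short are discarded. This is a simplified sketch and not the pymovements implementation; the function name and the parameter values (20 dva/s, 100 ms) are assumptions chosen for illustration.

```python
import math


def ivt_fixations(times, vx, vy, velocity_threshold=20.0, minimum_duration=100):
    """Group consecutive below-threshold samples into fixation events.

    Returns (onset, offset, duration) tuples in the units of `times`.
    """
    events = []
    start = None  # index of the first sample of the current run
    last = len(vx) - 1
    for i, (x, y) in enumerate(zip(vx, vy)):
        slow = math.hypot(x, y) < velocity_threshold
        if slow and start is None:
            start = i
        # Close the run when a fast sample appears or the data ends.
        if start is not None and (not slow or i == last):
            end = i if slow else i - 1
            duration = times[end] - times[start]
            if duration >= minimum_duration:
                events.append((times[start], times[end], duration))
            start = None
    return events


# 150 ms of slow drift, a 30 ms fast movement, then 50 ms of slow drift.
times = list(range(230))
vx = [5.0] * 150 + [300.0] * 30 + [5.0] * 50
vy = [0.0] * len(vx)
fixations = ivt_fixations(times, vx, vy)
print(fixations)
# [(0, 149, 149)]
```

The second slow stretch lasts only 49 ms and is therefore dropped by the minimum-duration filter.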

Next we detect some saccades. This time we don’t use the default parameters but specify our own:

[10]:
dataset.detect_events(pm.events.microsaccades, minimum_duration=12)

dataset.events[0].frame.filter(pl.col('name') == 'saccade').head()
2it [00:00, 68.97it/s]
[10]:
shape: (5, 6)
text_id  page_id  name       onset    offset   duration
i64      i64      str        i64      i64      i64
1        1        "saccade"  2415285  2415301  16
1        1        "saccade"  2415303  2415317  14
1        1        "saccade"  2415524  2415541  17
1        1        "saccade"  2415661  2415679  18
1        1        "saccade"  2415681  2415697  16
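
The sketch below illustrates the kind of adaptive velocity threshold commonly used for microsaccade detection, following the approach of Engbert and Kliegl: the threshold is a multiple of a median-based estimate of the velocity spread, computed separately per axis, and a sample counts as saccadic when it falls outside the resulting ellipse. It is a simplified plain-Python illustration, not the pymovements implementation; all function names and the threshold factor of 6 are assumptions.

```python
import math
from statistics import median


def median_based_std(values):
    """Robust spread estimate: sqrt(median(v^2) - median(v)^2)."""
    med = median(values)
    return math.sqrt(max(median([v * v for v in values]) - med * med, 0.0))


def saccade_flags(vx, vy, threshold_factor=6.0):
    """Flag samples whose velocity lies outside the elliptic threshold."""
    tx = threshold_factor * median_based_std(vx)
    ty = threshold_factor * median_based_std(vy)
    if tx == 0 or ty == 0:
        raise ValueError("velocity spread is zero; cannot derive a threshold")
    return [(x / tx) ** 2 + (y / ty) ** 2 > 1.0 for x, y in zip(vx, vy)]


def group_runs(flags, times, minimum_duration):
    """Turn runs of True flags into (onset, offset, duration) tuples."""
    events = []
    start = None
    for i, flag in enumerate(flags + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            duration = times[i - 1] - times[start]
            if duration >= minimum_duration:
                events.append((times[start], times[i - 1], duration))
            start = None
    return events


# Small deterministic "noise" with a 20-sample high-velocity burst.
times = list(range(200))
vx = [(-1) ** i * (1 + i % 3) for i in range(200)]
vx[100:120] = [200.0] * 20
vy = [0.5 * v for v in vx]
saccades = group_runs(saccade_flags(vx, vy), times, minimum_duration=12)
print(saccades)
# [(100, 119, 19)]
```

Because the threshold adapts to the noise level of each recording, the same factor can be reused across participants and trials.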

Computing event properties

The event dataframe currently holds only the name, onset, offset and duration of each event, preceded by the identifier columns text_id and page_id.

We now want to compute some additional properties for each event. Event properties are things like peak velocity, amplitude and dispersion during an event.

We start out with computing the peak velocity:

[11]:
dataset.compute_event_properties("peak_velocity")

dataset.events[0].frame.head()
2it [00:01,  1.44it/s]
[11]:
shape: (5, 7)
text_id  page_id  name        onset    offset   duration  peak_velocity
i64      i64      str         i64      i64      i64       f64
1        1        "fixation"  2415326  2415524  198       233.414003
1        1        "fixation"  2415555  2415661  106       18.852675
1        1        "fixation"  2415712  2415844  132       64.026588
1        1        "fixation"  2415877  2416024  147       226.852984
1        1        "fixation"  2416060  2416212  152       198.758588

We notice that a new column with the name peak_velocity has appeared in the event dataframe.

We can also pass a list of properties. Let’s add the amplitude and dispersion:

[12]:
dataset.compute_event_properties(["amplitude", "dispersion"])

dataset.events[0].frame.head()
2it [00:01,  1.38it/s]
[12]:
shape: (5, 9)
text_id  page_id  name        onset    offset   duration  peak_velocity  amplitude  dispersion
i64      i64      str         i64      i64      i64       f64            f64        f64
1        1        "fixation"  2415326  2415524  198       233.414003     2.717943   2.867542
1        1        "fixation"  2415555  2415661  106       18.852675      0.258032   0.346841
1        1        "fixation"  2415712  2415844  132       64.026588      0.327488   0.392723
1        1        "fixation"  2415877  2416024  147       226.852984     0.296799   0.412835
1        1        "fixation"  2416060  2416212  152       198.758588     0.242453   0.342738

This way we can compute all of our desired properties in a single run.
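
To show what these three properties measure, the sketch below computes them for a single event from its position and velocity samples. The definitions used here are common ones (peak of the velocity norm, diagonal of the bounding box, sum of the horizontal and vertical ranges) and are assumptions for illustration; the exact formulas in pymovements may differ. The function name and the sample values are hypothetical.

```python
import math


def event_properties(x_pos, y_pos, x_vel, y_vel):
    """Compute peak velocity, amplitude and dispersion for one event."""
    x_range = max(x_pos) - min(x_pos)
    y_range = max(y_pos) - min(y_pos)
    return {
        # Largest speed (norm of the velocity vector) during the event.
        "peak_velocity": max(math.hypot(vx, vy) for vx, vy in zip(x_vel, y_vel)),
        # Diagonal of the bounding box around all samples.
        "amplitude": math.hypot(x_range, y_range),
        # Sum of the horizontal and vertical extents.
        "dispersion": x_range + y_range,
    }


props = event_properties(
    x_pos=[0.0, 1.0, 3.0],
    y_pos=[0.0, 2.0, 4.0],
    x_vel=[30.0, 60.0, 0.0],
    y_vel=[40.0, 80.0, 0.0],
)
print(props)
```

Under these definitions the amplitude can never exceed the dispersion, which is consistent with the values in the table above.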

Plotting our data

pymovements provides a range of plotting functions.

You can browse through the available plotting functions here: Plotting

In this tutorial we will plot the saccadic main sequence of our data.

[13]:
pm.plotting.main_sequence_plot(dataset.events[0])
[Figure: saccadic main sequence plot for dataset.events[0]]

Saving and loading your dataframes

If we want to save interim results we can simply use the save() method like this:

[14]:
dataset.save()
100%|██████████| 2/2 [00:00<00:00, 784.94it/s]
100%|██████████| 2/2 [00:00<00:00, 449.62it/s]

Let’s test this out by initializing a new Dataset object in the same directory and loading in the preprocessed gaze and event data.

This time we don’t need to download anything.

[15]:
preprocessed_dataset = pm.datasets.ToyDataset(root='data/')

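# Load the saved gaze and event data into the new Dataset object.
preprocessed_dataset.load(events=True, preprocessed=True, subset=subset)

display(preprocessed_dataset.gaze[0])
display(preprocessed_dataset.events[0])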
100%|██████████| 2/2 [00:00<00:00, 959.25it/s]
100%|██████████| 2/2 [00:00<00:00, 855.98it/s]
<pymovements.gaze.gaze_dataframe.GazeDataFrame at 0x7f36640c1a00>
<pymovements.events.events.EventDataFrame at 0x7f36640c1ca0>