Saving and Loading Preprocessed Data

What you will learn in this tutorial:

  • how to save your preprocessed data

  • how to load your preprocessed data

Preparations

We import pymovements under the alias pm for convenience.

import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
INFO:pymovements.dataset.dataset:
        You are downloading the pymovements Toy Dataset. Please be aware that pymovements does not
        host or distribute any dataset resources and only provides a convenient interface to
        download the public dataset resources that were published by their respective authors.

        Please cite the referenced publication if you intend to use the dataset in your research.
        
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
Extracting archive:   0%|          | 0/23 [00:00<?, ?file/s]
Extracting archive: 100%|██████████| 23/23 [00:00<00:00, 368.43file/s]

Dataset
  • DatasetDefinition
    DatasetDefinition
    • 'ToyDataset'
    • 'pymovements Toy Dataset'
    • 'Example toy dataset. This dataset includes monocu...'
      'Example toy dataset.\n\nThis dataset includes monocular eye tracking data from a single participant in a single\nsession. Eye movements are recorded at a sampling frequency of 1000 Hz using an EyeLink Portable\nDuo video-based eye tracker and are provided as pixel coordinates.\n\nThe participant is instructed to read 4 texts with 5 screens each.\n'
    • Experiment
      Experiment
      • EyeTracker
        EyeTracker
        • None
        • None
        • None
        • None
        • 1000
        • None
        • None
      • Screen
        Screen
        • 68
        • 30.2
        • 1024
        • 'upper left'
        • tuple (2 items)
          • 1280
          • 1024
        • tuple (2 items)
          • 38
          • 30.2
        • 38
        • 1280
        • 15.599386487782953
        • -15.599386487782953
        • 12.508044410882546
        • -12.508044410882546
    • list (1 items)
      • ResourceDefinition
        • 'gaze'
        • 'pymovements-toy-dataset.zip'
        • 'trial_{text_id:d}_{page_id:d}.csv'
        • dict (2 items)
          • <class 'int'>
          • <class 'int'>
        • None
        • dict (4 items)
          • 'timestamp'
          • 'ms'
          • (2 more)
        • '256901852c1c07581d375eef705855d6'
        • None
        • WebSource
          WebSource(url='https://github.com/pymovements/pymovements-toy-dataset/archive/refs/heads/main.zip', filename='pymovements-toy-dataset.zip', md5='256901852c1c07581d375eef705855d6', mirrors=None)
        • 'https://github.com/pymovements/pymovements-toy-dat...'
          'https://github.com/pymovements/pymovements-toy-dataset/archive/refs/heads/main.zip'
  • tuple (20 items)
    • Events
      • DataFrame (4 columns, 0 rows)
        shape: (0, 4)
        name  onset  offset  duration
        str   i64    i64     i64
      • None
    • Events
      • DataFrame (4 columns, 0 rows)
        shape: (0, 4)
        name  onset  offset  duration
        str   i64    i64     i64
      • None
    • (18 more)
  • dict (1 items)
    • DataFrame (3 columns, 20 rows)
      shape: (20, 3)
      text_id  page_id  filepath
      i64      i64      str
      0        1        "pymovements-toy-dataset-main/d…
      0        2        "pymovements-toy-dataset-main/d…
      0        3        "pymovements-toy-dataset-main/d…
      0        4        "pymovements-toy-dataset-main/d…
      0        5        "pymovements-toy-dataset-main/d…
      …
      3        1        "pymovements-toy-dataset-main/d…
      3        2        "pymovements-toy-dataset-main/d…
      3        3        "pymovements-toy-dataset-main/d…
      3        4        "pymovements-toy-dataset-main/d…
      3        5        "pymovements-toy-dataset-main/d…
  • list (20 items)
    • Gaze
      • DataFrame (4 columns, 17223 rows)
        shape: (17_223, 4)
        time     stimuli_x  stimuli_y  pixel
        i64      f64        f64        list[f64]
        1988145  -1.0       -1.0       [206.8, 152.4]
        1988146  -1.0       -1.0       [206.9, 152.1]
        1988147  -1.0       -1.0       [207.0, 151.8]
        1988148  -1.0       -1.0       [207.1, 151.7]
        1988149  -1.0       -1.0       [207.0, 151.5]
        …
        2005363  -1.0       -1.0       [361.0, 415.4]
        2005364  -1.0       -1.0       [358.0, 414.5]
        2005365  -1.0       -1.0       [355.8, 413.8]
        2005366  -1.0       -1.0       [353.1, 413.2]
        2005367  -1.0       -1.0       [351.2, 412.9]
      • Events
        Events
        • DataFrame (4 columns, 0 rows)
          shape: (0, 4)
          name  onset  offset  duration
          str   i64    i64     i64
        • None
      • dict (2 items)
        • 0
        • 1
      • None
      • None
      • Experiment
        Experiment
        • EyeTracker
          EyeTracker
          • None
          • None
          • None
          • None
          • 1000
          • None
          • None
        • Screen
          Screen
          • 68
          • 30.2
          • 1024
          • 'upper left'
          • tuple (2 items)
            • 1280
            • 1024
          • tuple (2 items)
            • 38
            • 30.2
          • 38
          • 1280
          • 15.599386487782953
          • -15.599386487782953
          • 12.508044410882546
          • -12.508044410882546
    • Gaze
      • DataFrame (4 columns, 29799 rows)
        shape: (29_799, 4)
        time     stimuli_x  stimuli_y  pixel
        i64      f64        f64        list[f64]
        2008305  -1.0       -1.0       [141.4, 153.6]
        2008306  -1.0       -1.0       [141.1, 153.2]
        2008307  -1.0       -1.0       [140.7, 152.8]
        2008308  -1.0       -1.0       [140.6, 152.7]
        2008309  -1.0       -1.0       [140.5, 152.6]
        …
        2038099  -1.0       -1.0       [273.8, 773.8]
        2038100  -1.0       -1.0       [273.8, 774.1]
        2038101  -1.0       -1.0       [273.9, 774.5]
        2038102  -1.0       -1.0       [274.0, 774.4]
        2038103  -1.0       -1.0       [274.0, 773.9]
      • Events
        Events
        • DataFrame (4 columns, 0 rows)
          shape: (0, 4)
          name  onset  offset  duration
          str   i64    i64     i64
        • None
      • dict (2 items)
        • 0
        • 2
      • None
      • None
      • Experiment
        Experiment
        • EyeTracker
          EyeTracker
          • None
          • None
          • None
          • None
          • 1000
          • None
          • None
        • Screen
          Screen
          • 68
          • 30.2
          • 1024
          • 'upper left'
          • tuple (2 items)
            • 1280
            • 1024
          • tuple (2 items)
            • 38
            • 30.2
          • 38
          • 1280
          • 15.599386487782953
          • -15.599386487782953
          • 12.508044410882546
          • -12.508044410882546
    • (18 more)
  • Participants
    Participants
    • DataFrame (1 columns, 0 rows)
      shape: (0, 1)
      participant_id
      str
    • dict (1 items)
      • dict (1 items)
        • 'string'
  • PosixPath('data/ToyDataset')
  • DatasetPaths
    DatasetPaths
    • PosixPath('data/ToyDataset')
    • PosixPath('data/ToyDataset/downloads')
    • PosixPath('data/ToyDataset/events')
    • PosixPath('data/ToyDataset/precomputed_events')
    • PosixPath('data/ToyDataset/precomputed_reading_measures')
    • PosixPath('data/ToyDataset/preprocessed')
    • PosixPath('data/ToyDataset/raw')
    • PosixPath('data/ToyDataset')
    • PosixPath('data/ToyDataset/stimuli')
  • list (0 items)
  • list (0 items)
  • list (0 items)

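Now that the data is loaded, let’s do some preprocessing. We convert the pixel coordinates to degrees of visual angle and then compute velocities from the resulting positions: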

dataset.pix2deg()
dataset.pos2vel()

dataset.gaze[0]
Gaze
  • DataFrame (6 columns, 17223 rows)
    shape: (17_223, 6)
    time     stimuli_x  stimuli_y  pixel           position                 velocity
    i64      f64        f64        list[f64]       list[f64]                list[f64]
    1988145  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
    1988146  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
    1988147  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
    1988148  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
    1988149  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
    …
    2005363  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
    2005364  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
    2005365  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
    2005366  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
    2005367  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
  • Events
    Events
    • DataFrame (4 columns, 0 rows)
      shape: (0, 4)
      name  onset  offset  duration
      str   i64    i64     i64
    • None
  • dict (2 items)
    • 0
    • 1
  • None
  • None
  • Experiment
    Experiment
    • EyeTracker
      EyeTracker
      • None
      • None
      • None
      • None
      • 1000
      • None
      • None
    • Screen
      Screen
      • 68
      • 30.2
      • 1024
      • 'upper left'
      • tuple (2 items)
        • 1280
        • 1024
      • tuple (2 items)
        • 38
        • 30.2
      • 38
      • 1280
      • 15.599386487782953
      • -15.599386487782953
      • 12.508044410882546
      • -12.508044410882546

The gaze data now has two additional columns: position, which holds the gaze position in degrees of visual angle, and velocity.
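To make the two new columns less of a black box, here is a minimal sketch in plain Python that reproduces the first printed values. It is not pymovements' implementation. The function names are hypothetical, and the sketch assumes that the screen values printed above mean a width of 1280 px and 38 cm at a viewing distance of 68 cm, that the pixel grid is centred on the screen, and that velocity is a smoothed five-sample difference. The last assumption would explain why the first two and last two velocity samples are null.

```python
import math


def pixel_to_degree(pixel, n_pixels, size_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle.

    Assumption: pixel (n_pixels - 1) / 2 is the screen centre (0 degrees).
    """
    centered_px = pixel - (n_pixels - 1) / 2
    centered_cm = centered_px * size_cm / n_pixels
    return math.degrees(math.atan2(centered_cm, distance_cm))


def five_point_velocity(positions, sampling_rate):
    """Smoothed derivative over a five-sample window.

    Samples without a full window (two at each end) stay None.
    """
    velocities = [None] * len(positions)
    for i in range(2, len(positions) - 2):
        delta = (
            positions[i + 2] + positions[i + 1]
            - positions[i - 1] - positions[i - 2]
        )
        velocities[i] = delta * sampling_rate / 6
    return velocities


# Horizontal pixel coordinate of the first sample in dataset.gaze[0].
x_deg = pixel_to_degree(206.8, n_pixels=1280, size_cm=38, distance_cm=68)
print(x_deg)

# First five horizontal positions of dataset.gaze[0], recorded at 1000 Hz.
x_positions = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]
x_velocities = five_point_velocity(x_positions, sampling_rate=1000)
print(x_velocities)
```

Both results agree with the printed position and velocity values up to rounding, which suggests the assumptions are reasonable for this dataset.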

Saving

Saving your preprocessed data is as simple as:

dataset.save_preprocessed()
Dataset
  • DatasetDefinition
    DatasetDefinition
    • 'ToyDataset'
    • 'pymovements Toy Dataset'
    • 'Example toy dataset. This dataset includes monocu...'
      'Example toy dataset.\n\nThis dataset includes monocular eye tracking data from a single participant in a single\nsession. Eye movements are recorded at a sampling frequency of 1000 Hz using an EyeLink Portable\nDuo video-based eye tracker and are provided as pixel coordinates.\n\nThe participant is instructed to read 4 texts with 5 screens each.\n'
    • Experiment
      Experiment
      • EyeTracker
        EyeTracker
        • None
        • None
        • None
        • None
        • 1000
        • None
        • None
      • Screen
        Screen
        • 68
        • 30.2
        • 1024
        • 'upper left'
        • tuple (2 items)
          • 1280
          • 1024
        • tuple (2 items)
          • 38
          • 30.2
        • 38
        • 1280
        • 15.599386487782953
        • -15.599386487782953
        • 12.508044410882546
        • -12.508044410882546
    • list (1 items)
      • ResourceDefinition
        • 'gaze'
        • 'pymovements-toy-dataset.zip'
        • 'trial_{text_id:d}_{page_id:d}.csv'
        • dict (2 items)
          • <class 'int'>
          • <class 'int'>
        • None
        • dict (4 items)
          • 'timestamp'
          • 'ms'
          • (2 more)
        • '256901852c1c07581d375eef705855d6'
        • None
        • WebSource
          WebSource(url='https://github.com/pymovements/pymovements-toy-dataset/archive/refs/heads/main.zip', filename='pymovements-toy-dataset.zip', md5='256901852c1c07581d375eef705855d6', mirrors=None)
        • 'https://github.com/pymovements/pymovements-toy-dat...'
          'https://github.com/pymovements/pymovements-toy-dataset/archive/refs/heads/main.zip'
  • tuple (20 items)
    • Events
      • DataFrame (4 columns, 0 rows)
        shape: (0, 4)
        name  onset  offset  duration
        str   i64    i64     i64
      • None
    • Events
      • DataFrame (4 columns, 0 rows)
        shape: (0, 4)
        name  onset  offset  duration
        str   i64    i64     i64
      • None
    • (18 more)
  • dict (1 items)
    • DataFrame (3 columns, 20 rows)
      shape: (20, 3)
      text_id  page_id  filepath
      i64      i64      str
      0        1        "pymovements-toy-dataset-main/d…
      0        2        "pymovements-toy-dataset-main/d…
      0        3        "pymovements-toy-dataset-main/d…
      0        4        "pymovements-toy-dataset-main/d…
      0        5        "pymovements-toy-dataset-main/d…
      …
      3        1        "pymovements-toy-dataset-main/d…
      3        2        "pymovements-toy-dataset-main/d…
      3        3        "pymovements-toy-dataset-main/d…
      3        4        "pymovements-toy-dataset-main/d…
      3        5        "pymovements-toy-dataset-main/d…
  • list (20 items)
    • Gaze
      • DataFrame (6 columns, 17223 rows)
        shape: (17_223, 6)
        time     stimuli_x  stimuli_y  pixel           position                 velocity
        i64      f64        f64        list[f64]       list[f64]                list[f64]
        1988145  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
        1988146  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
        1988147  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
        1988148  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
        1988149  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
        …
        2005363  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
        2005364  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
        2005365  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
        2005366  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
        2005367  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
      • Events
        Events
        • DataFrame (4 columns, 0 rows)
          shape: (0, 4)
          name  onset  offset  duration
          str   i64    i64     i64
        • None
      • dict (2 items)
        • 0
        • 1
      • None
      • None
      • Experiment
        Experiment
        • EyeTracker
          EyeTracker
          • None
          • None
          • None
          • None
          • 1000
          • None
          • None
        • Screen
          Screen
          • 68
          • 30.2
          • 1024
          • 'upper left'
          • tuple (2 items)
            • 1280
            • 1024
          • tuple (2 items)
            • 38
            • 30.2
          • 38
          • 1280
          • 15.599386487782953
          • -15.599386487782953
          • 12.508044410882546
          • -12.508044410882546
    • Gaze
      • DataFrame (6 columns, 29799 rows)
        shape: (29_799, 6)
        time     stimuli_x  stimuli_y  pixel           position                 velocity
        i64      f64        f64        list[f64]       list[f64]                list[f64]
        2008305  -1.0       -1.0       [141.4, 153.6]  [-12.268583, -8.823284]  [null, null]
        2008306  -1.0       -1.0       [141.1, 153.2]  [-12.275749, -8.832989]  [null, null]
        2008307  -1.0       -1.0       [140.7, 152.8]  [-12.285302, -8.842695]  [-5.572617, -6.065816]
        2008308  -1.0       -1.0       [140.6, 152.7]  [-12.28769, -8.845121]   [-3.582268, -4.043733]
        2008309  -1.0       -1.0       [140.5, 152.6]  [-12.290078, -8.847547]  [-2.388085, -2.021821]
        …
        2038099  -1.0       -1.0       [273.8, 773.8]  [-9.071149, 6.490168]    [1.21962, 1.635403]
        2038100  -1.0       -1.0       [273.8, 774.1]  [-9.071149, 6.497527]    [1.626175, 4.497406]
        2038101  -1.0       -1.0       [273.9, 774.5]  [-9.06871, 6.50734]      [1.626186, 1.635423]
        2038102  -1.0       -1.0       [274.0, 774.4]  [-9.066271, 6.504886]    [null, null]
        2038103  -1.0       -1.0       [274.0, 773.9]  [-9.066271, 6.492621]    [null, null]
      • Events
        Events
        • DataFrame (4 columns, 0 rows)
          shape: (0, 4)
          name  onset  offset  duration
          str   i64    i64     i64
        • None
      • dict (2 items)
        • 0
        • 2
      • None
      • None
      • Experiment
        Experiment
        • EyeTracker
          EyeTracker
          • None
          • None
          • None
          • None
          • 1000
          • None
          • None
        • Screen
          Screen
          • 68
          • 30.2
          • 1024
          • 'upper left'
          • tuple (2 items)
            • 1280
            • 1024
          • tuple (2 items)
            • 38
            • 30.2
          • 38
          • 1280
          • 15.599386487782953
          • -15.599386487782953
          • 12.508044410882546
          • -12.508044410882546
    • (18 more)
  • Participants
    Participants
    • DataFrame (1 columns, 0 rows)
      shape: (0, 1)
      participant_id
      str
    • dict (1 items)
      • dict (1 items)
        • 'string'
  • PosixPath('data/ToyDataset')
  • DatasetPaths
    DatasetPaths
    • PosixPath('data/ToyDataset')
    • PosixPath('data/ToyDataset/downloads')
    • PosixPath('data/ToyDataset/events')
    • PosixPath('data/ToyDataset/precomputed_events')
    • PosixPath('data/ToyDataset/precomputed_reading_measures')
    • PosixPath('data/ToyDataset/preprocessed')
    • PosixPath('data/ToyDataset/raw')
    • PosixPath('data/ToyDataset')
    • PosixPath('data/ToyDataset/stimuli')
  • list (0 items)
  • list (0 items)
  • list (0 items)

All of the preprocessed data is saved into this directory:

dataset.paths.preprocessed
PosixPath('data/ToyDataset/preprocessed')

Let’s confirm it by printing all the new files in this directory:

print(list(dataset.paths.preprocessed.glob('*/*/*')))
[PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_1.feather')]

All files have been saved into the Dataset.paths.preprocessed directory as feather files, one per trial.

If we want to save the data into an alternative directory and use a different file format, such as csv, we can pass both as arguments:
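The glob output above comes back in arbitrary order, which makes it hard to see at a glance whether every trial was written. The following sketch indexes saved files by their trial identifiers using only the standard library. The helper name index_saved_files is hypothetical, and the temporary directory tree is a stand-in that mimics the layout shown above so the example runs without the real dataset.

```python
import re
import tempfile
from pathlib import Path

# Mirrors the 'trial_{text_id:d}_{page_id:d}' part of the filename pattern.
TRIAL_PATTERN = re.compile(r"trial_(?P<text_id>\d+)_(?P<page_id>\d+)$")


def index_saved_files(directory, extension):
    """Map (text_id, page_id) to the saved file, searching recursively."""
    index = {}
    for path in sorted(Path(directory).rglob(f"*.{extension}")):
        match = TRIAL_PATTERN.match(path.stem)
        if match:
            key = (int(match["text_id"]), int(match["page_id"]))
            index[key] = path
    return index


# Stand-in tree: 4 texts with 5 pages each, as in the dataset description.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "preprocessed" / "pymovements-toy-dataset-main" / "data"
    data_dir.mkdir(parents=True)
    for text_id in range(4):
        for page_id in range(1, 6):
            (data_dir / f"trial_{text_id}_{page_id}.feather").touch()

    index = index_saved_files(Path(tmp) / "preprocessed", "feather")

print(len(index))         # 20
print(sorted(index)[:3])  # [(0, 1), (0, 2), (0, 3)]
```

With the real dataset, the same helper could be pointed at dataset.paths.preprocessed instead of the temporary directory.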

dataset.save_preprocessed(preprocessed_dirname='preprocessed_csv', extension='csv')

Let’s confirm again by printing all the new files in this alternative directory:

alternative_dirpath = dataset.path / 'preprocessed_csv'
print(list(alternative_dirpath.glob('*/*/*')))
[PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_2.csv')]
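When the same data is kept in two formats, it can be useful to check that both directories cover the same trials. This is a minimal sketch with the standard library only. The helper names are hypothetical and the temporary directories stand in for preprocessed and preprocessed_csv, with one file deliberately left out to show what a mismatch looks like.

```python
import tempfile
from pathlib import Path


def saved_stems(directory, extension):
    """Return the file stems with the given extension below directory."""
    return {path.stem for path in Path(directory).rglob(f"*.{extension}")}


def missing_in_second(first_dir, first_ext, second_dir, second_ext):
    """List stems saved in the first directory but absent from the second."""
    first = saved_stems(first_dir, first_ext)
    second = saved_stems(second_dir, second_ext)
    return sorted(first - second)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    layout = [
        ("preprocessed", "feather", ["trial_0_1", "trial_0_2"]),
        ("preprocessed_csv", "csv", ["trial_0_1"]),  # trial_0_2 left out
    ]
    for dirname, ext, stems in layout:
        (root / dirname).mkdir()
        for stem in stems:
            (root / dirname / f"{stem}.{ext}").touch()

    missing = missing_in_second(
        root / "preprocessed", "feather",
        root / "preprocessed_csv", "csv",
    )

print(missing)  # ['trial_0_2']
```

An empty list means every trial saved as feather also exists as csv.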

Loading

Now let’s imagine that this preprocessing and saving was done in another file, and we only want to load the preprocessed data.

We simulate this by initializing a new dataset object. We don’t need to download any additional data.

events_dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')

The preprocessed data can now simply be loaded by setting preprocessed to True:

events_dataset.load(preprocessed=True)

events_dataset.gaze[0]
Gaze
  • DataFrame (6 columns, 17223 rows)
    shape: (17_223, 6)
    time     stimuli_x  stimuli_y  pixel           position                 velocity
    i64      f64        f64        list[f64]       list[f64]                list[f64]
    1988145  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
    1988146  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
    1988147  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
    1988148  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
    1988149  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
    …
    2005363  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
    2005364  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
    2005365  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
    2005366  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
    2005367  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
  • Events
    Events
    • DataFrame (4 columns, 0 rows)
      shape: (0, 4)
      name  onset  offset  duration
      str   i64    i64     i64
    • None
  • dict (2 items)
    • 0
    • 1
  • None
  • None
  • Experiment
    Experiment
    • EyeTracker
      EyeTracker
      • None
      • None
      • None
      • None
      • 1000
      • None
      • None
    • Screen
      Screen
      • 68
      • 30.2
      • 1024
      • 'upper left'
      • tuple (2 items)
        • 1280
        • 1024
      • tuple (2 items)
        • 38
        • 30.2
      • 38
      • 1280
      • 15.599386487782953
      • -15.599386487782953
      • 12.508044410882546
      • -12.508044410882546

By default, the data is read from the preprocessed directory and the files are expected to have the feather extension.

If the data was saved under an alternative directory name or in another file format, pass the same arguments that were used for saving:
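Because the directory name and the extension must match what was used when saving, a quick check before loading can make a mismatch easier to spot. This is a minimal sketch using only the standard library. The helper has_preprocessed_files is hypothetical and not part of pymovements, its defaults simply mirror the defaults described above, and the temporary directory stands in for the dataset root.

```python
import tempfile
from pathlib import Path


def has_preprocessed_files(dataset_root, dirname="preprocessed", extension="feather"):
    """Return True if dirname exists below dataset_root and holds matching files."""
    directory = Path(dataset_root) / dirname
    return directory.is_dir() and any(directory.rglob(f"*.{extension}"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = has_preprocessed_files(root)

    # Simulate data that was saved only as csv in an alternative directory.
    target = root / "preprocessed_csv" / "data"
    target.mkdir(parents=True)
    (target / "trial_0_1.csv").touch()

    after_default = has_preprocessed_files(root)
    after_csv = has_preprocessed_files(root, dirname="preprocessed_csv", extension="csv")

print(before, after_default, after_csv)  # False False True
```

In this situation the defaults find nothing, so the alternative directory name and the csv extension would have to be passed when loading.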

events_dataset.load(
    preprocessed=True,
    preprocessed_dirname='preprocessed_csv',
    extension='csv',
)
events_dataset.gaze[0]
Gaze
  • DataFrame (6 columns, 17223 rows)
    shape: (17_223, 6)
    time     stimuli_x  stimuli_y  pixel           position                 velocity
    i64      f64        f64        list[f64]       list[f64]                list[f64]
    1988145  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
    1988146  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
    1988147  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
    1988148  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
    1988149  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
    …
    2005363  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
    2005364  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
    2005365  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
    2005366  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
    2005367  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
  • Events
    Events
    • DataFrame (4 columns, 0 rows)
      shape: (0, 4)
      name  onset  offset  duration
      str   i64    i64     i64
    • None
  • dict (2 items)
    • 0
    • 1
  • None
  • None
  • None

What you have learned in this tutorial: