Saving and Loading Preprocessed Data

What you will learn in this tutorial:

  • how to save your preprocessed data

  • how to load your preprocessed data

Preparations

We import pymovements under the alias pm for convenience.

import pymovements as pm

Let’s start by downloading our ToyDataset and loading in its data:

dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
INFO:pymovements.dataset.dataset:
        You are downloading the pymovements Toy Dataset. Please be aware that pymovements does not
        host or distribute any dataset resources and only provides a convenient interface to
        download the public dataset resources that were published by their respective authors.

        Please cite the referenced publication if you intend to use the dataset in your research.
        
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
100%|██████████| 23/23 [00:00<00:00, 341.06it/s]

Dataset
  • DatasetDefinition
    • name: 'ToyDataset'
    • long name: 'pymovements Toy Dataset'
    • Experiment
      • EyeTracker: sampling rate 1000, all other fields None
      • sampling rate: 1000
      • Screen: 1280 x 1024 px, 38 x 30.2 cm, distance 68 cm, origin 'upper left',
        horizontal range ±15.599386487782953 and vertical range ±12.508044410882546 degrees of visual angle
    • filename format: 'trial_{text_id:d}_{page_id:d}.csv' (text_id: int, page_id: int)
    • pixel columns: 'x', 'y'
    • ResourceDefinition
      • 'gaze'
      • 'pymovements-toy-dataset.zip'
      • 'trial_{text_id:d}_{page_id:d}.csv' (text_id: int, page_id: int)
      • '256901852c1c07581d375eef705855d6'
      • 'https://github.com/pymovements/pymovements-toy-dataset/archive/refs/heads/main.zip'
    • time column: 'timestamp'
    • time unit: 'ms'
  • tuple of empty DataFrames, each with
    shape: (0, 6)
    text_id  page_id  name  onset  offset  duration
    i64      i64      str   i64    i64     i64
  • DataFrame (5 columns, 20 rows)
    shape: (20, 5)
    text_id  page_id  filepath                           load_function  load_kwargs
    i64      i64      str                                null           null
    0        1        "pymovements-toy-dataset-main/d…   null           null
    0        2        "pymovements-toy-dataset-main/d…   null           null
    0        3        "pymovements-toy-dataset-main/d…   null           null
    0        4        "pymovements-toy-dataset-main/d…   null           null
    0        5        "pymovements-toy-dataset-main/d…   null           null
    …
    3        1        "pymovements-toy-dataset-main/d…   null           null
    3        2        "pymovements-toy-dataset-main/d…   null           null
    3        3        "pymovements-toy-dataset-main/d…   null           null
    3        4        "pymovements-toy-dataset-main/d…   null           null
    3        5        "pymovements-toy-dataset-main/d…   null           null
  • list (20 items)
    • Gaze
      • DataFrame (6 columns, 17223 rows)
        shape: (17_223, 6)
        time     stimuli_x  stimuli_y  text_id  page_id  pixel
        i64      f64        f64        i64      i64      list[f64]
        1988145  -1.0       -1.0       0        1        [206.8, 152.4]
        1988146  -1.0       -1.0       0        1        [206.9, 152.1]
        1988147  -1.0       -1.0       0        1        [207.0, 151.8]
        1988148  -1.0       -1.0       0        1        [207.1, 151.7]
        1988149  -1.0       -1.0       0        1        [207.0, 151.5]
        …
        2005363  -1.0       -1.0       0        1        [361.0, 415.4]
        2005364  -1.0       -1.0       0        1        [358.0, 414.5]
        2005365  -1.0       -1.0       0        1        [355.8, 413.8]
        2005366  -1.0       -1.0       0        1        [353.1, 413.2]
        2005367  -1.0       -1.0       0        1        [351.2, 412.9]
      • Events
        • DataFrame (6 columns, 0 rows)
          shape: (0, 6)
          text_id  page_id  name  onset  offset  duration
          i64      i64      str   i64    i64     i64
        • list (2 items): 'text_id', 'page_id'
      • list (2 items): 'text_id', 'page_id'
      • Experiment (same EyeTracker, sampling rate and Screen values as in the DatasetDefinition)
    • Gaze
      • DataFrame (6 columns, 29799 rows)
        shape: (29_799, 6)
        time     stimuli_x  stimuli_y  text_id  page_id  pixel
        i64      f64        f64        i64      i64      list[f64]
        2008305  -1.0       -1.0       0        2        [141.4, 153.6]
        2008306  -1.0       -1.0       0        2        [141.1, 153.2]
        2008307  -1.0       -1.0       0        2        [140.7, 152.8]
        2008308  -1.0       -1.0       0        2        [140.6, 152.7]
        2008309  -1.0       -1.0       0        2        [140.5, 152.6]
        …
        2038099  -1.0       -1.0       0        2        [273.8, 773.8]
        2038100  -1.0       -1.0       0        2        [273.8, 774.1]
        2038101  -1.0       -1.0       0        2        [273.9, 774.5]
        2038102  -1.0       -1.0       0        2        [274.0, 774.4]
        2038103  -1.0       -1.0       0        2        [274.0, 773.9]
      • Events, list (2 items) and Experiment as for the first Gaze
    • (18 more)
  • PosixPath('data/ToyDataset')
  • DatasetPaths
    • PosixPath('data/ToyDataset')
    • PosixPath('data/ToyDataset/downloads')
    • PosixPath('data/ToyDataset/events')
    • PosixPath('data/ToyDataset/precomputed_events')
    • PosixPath('data/ToyDataset/precomputed_reading_measures')
    • PosixPath('data/ToyDataset/preprocessed')
    • PosixPath('data/ToyDataset/raw')
    • PosixPath('data/ToyDataset')
  • list (0 items)
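
The download log above reports that the archive was "verified", and the resource definition lists a 32-character checksum. As a rough illustration of what such a check can look like, the sketch below computes the MD5 digest of a file with Python's standard hashlib module and compares it with an expected value. This is not pymovements' own implementation: the helper names md5_of_file and is_verified are hypothetical, and treating the checksum as an MD5 digest is an assumption based on its length.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with Path(path).open('rb') as file:
        # Read in chunks so that large archives need not fit in memory.
        for chunk in iter(lambda: file.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def is_verified(path, expected_md5):
    """Check whether the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.lower()


# Hypothetical usage with the path and checksum shown in the output above:
# is_verified(
#     'data/ToyDataset/downloads/pymovements-toy-dataset.zip',
#     '256901852c1c07581d375eef705855d6',
# )
```

The usage lines are left commented out because they only make sense after the archive has been downloaded to that path.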

Now let’s do some preprocessing on the loaded data:

dataset.pix2deg()  # pixel coordinates -> degrees of visual angle
dataset.pos2vel()  # positions -> velocities

dataset.gaze[0]
          
Gaze
  • DataFrame (8 columns, 17223 rows)
    shape: (17_223, 8)
    time     stimuli_x  stimuli_y  text_id  page_id  pixel           position                 velocity
    i64      f64        f64        i64      i64      list[f64]       list[f64]                list[f64]
    1988145  -1.0       -1.0       0        1        [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
    1988146  -1.0       -1.0       0        1        [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
    1988147  -1.0       -1.0       0        1        [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
    1988148  -1.0       -1.0       0        1        [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
    1988149  -1.0       -1.0       0        1        [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
    …
    2005363  -1.0       -1.0       0        1        [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
    2005364  -1.0       -1.0       0        1        [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
    2005365  -1.0       -1.0       0        1        [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
    2005366  -1.0       -1.0       0        1        [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
    2005367  -1.0       -1.0       0        1        [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
  • Events
    • DataFrame (6 columns, 0 rows)
      shape: (0, 6)
      text_id  page_id  name  onset  offset  duration
      i64      i64      str   i64    i64     i64
    • list (2 items): 'text_id', 'page_id'
  • list (2 items): 'text_id', 'page_id'
  • Experiment
    • EyeTracker: sampling rate 1000, all other fields None
    • sampling rate: 1000
    • Screen: 1280 x 1024 px, 38 x 30.2 cm, distance 68 cm, origin 'upper left',
      horizontal range ±15.599386487782953 and vertical range ±12.508044410882546 degrees of visual angle

The gaze data now has two additional columns: position, in degrees of visual angle, and velocity.
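
To make these two columns less of a black box, the following sketch recomputes by hand the first position value and the first non-null horizontal velocity value from the output above. It is not pymovements' implementation. The helper names are hypothetical, the screen parameters are read off the Screen entries in the output (assumed to mean 1280 x 1024 px, 38 x 30.2 cm, a viewing distance of 68 cm and an origin in the upper left), and the five-point difference formula is an assumption that happens to agree with the printed values.

```python
import math


def pix2deg(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle.

    Assumes the pixel origin is in the upper left corner, so the
    coordinate is first shifted to be relative to the screen center.
    """
    centered_px = pixel - (screen_px - 1) / 2
    centered_cm = centered_px * screen_cm / screen_px
    return math.degrees(math.atan2(centered_cm, distance_cm))


def fivepoint_velocity(positions, index, sampling_rate):
    """Estimate the velocity at `index` from its four neighbouring samples."""
    p = positions
    forward = p[index + 2] + p[index + 1]
    backward = p[index - 1] + p[index - 2]
    return (forward - backward) * sampling_rate / 6


# First sample of the first gaze frame: pixel = [206.8, 152.4]
x_deg = pix2deg(206.8, screen_px=1280, screen_cm=38, distance_cm=68)
y_deg = pix2deg(152.4, screen_px=1024, screen_cm=30.2, distance_cm=68)
print(x_deg, y_deg)

# Horizontal positions of the first five samples, recorded at 1000 Hz
x_positions = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]
x_velocity = fivepoint_velocity(x_positions, index=2, sampling_rate=1000)
print(x_velocity)
```

The printed values should be close to the first position entry and to the horizontal component of the third velocity entry. Small deviations in the velocity are expected because the positions used here are the rounded values from the printed table. A five-point estimate needs two samples on either side, which would also explain why the first two and last two velocity rows are null.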

Saving

Saving your preprocessed data is as simple as:

dataset.save_preprocessed()
          
          Dataset
          • DatasetDefinition (same values as in the Dataset output above)
              • tuple of empty DataFrames, each with
                shape: (0, 6)
                text_id  page_id  name  onset  offset  duration
                i64      i64      str   i64    i64     i64
              • DataFrame (5 columns, 20 rows) (same file table as in the Dataset output above)
              • list (20 items)
                • Gaze
                  • DataFrame (8 columns, 17223 rows)
                    shape: (17_223, 8)
                    time     stimuli_x  stimuli_y  text_id  page_id  pixel           position                 velocity
                    i64      f64        f64        i64      i64      list[f64]       list[f64]                list[f64]
                    1988145  -1.0       -1.0       0        1        [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
                    1988146  -1.0       -1.0       0        1        [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
                    1988147  -1.0       -1.0       0        1        [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
                    1988148  -1.0       -1.0       0        1        [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
                    1988149  -1.0       -1.0       0        1        [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
                    …
                    2005363  -1.0       -1.0       0        1        [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
                    2005364  -1.0       -1.0       0        1        [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
                    2005365  -1.0       -1.0       0        1        [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
                    2005366  -1.0       -1.0       0        1        [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
                    2005367  -1.0       -1.0       0        1        [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
                  • Events
                    • DataFrame (6 columns, 0 rows)
                      shape: (0, 6)
                      text_id  page_id  name  onset  offset  duration
                      i64      i64      str   i64    i64     i64
                    • list (2 items): 'text_id', 'page_id'
                  • list (2 items): 'text_id', 'page_id'
                  • Experiment (same EyeTracker, sampling rate and Screen values as in the output above)
                • Gaze
                  • DataFrame (8 columns, 29799 rows)
                    shape: (29_799, 8)
                    time     stimuli_x  stimuli_y  text_id  page_id  pixel           position                 velocity
                    i64      f64        f64        i64      i64      list[f64]       list[f64]                list[f64]
                    2008305  -1.0       -1.0       0        2        [141.4, 153.6]  [-12.268583, -8.823284]  [null, null]
                    2008306  -1.0       -1.0       0        2        [141.1, 153.2]  [-12.275749, -8.832989]  [null, null]
                    2008307  -1.0       -1.0       0        2        [140.7, 152.8]  [-12.285302, -8.842695]  [-5.572617, -6.065816]
                    2008308  -1.0       -1.0       0        2        [140.6, 152.7]  [-12.28769, -8.845121]   [-3.582268, -4.043733]
                    2008309  -1.0       -1.0       0        2        [140.5, 152.6]  [-12.290078, -8.847547]  [-2.388085, -2.021821]
                    …
                    2038099  -1.0       -1.0       0        2        [273.8, 773.8]  [-9.071149, 6.490168]    [1.21962, 1.635403]
                    2038100  -1.0       -1.0       0        2        [273.8, 774.1]  [-9.071149, 6.497527]    [1.626175, 4.497406]
                    2038101  -1.0       -1.0       0        2        [273.9, 774.5]  [-9.06871, 6.50734]      [1.626186, 1.635423]
                    2038102  -1.0       -1.0       0        2        [274.0, 774.4]  [-9.066271, 6.504886]    [null, null]
                    2038103  -1.0       -1.0       0        2        [274.0, 773.9]  [-9.066271, 6.492621]    [null, null]
                  • Events
                    Events
                    • DataFrame (6 columns, 0 rows)
                      shape: (0, 6)
                      text_idpage_idnameonsetoffsetduration
                      i64i64stri64i64i64
                    • list (2 items)
                      • 'text_id'
                      • 'page_id'
                  • list (2 items)
                    • 'text_id'
                    • 'page_id'
                  • Experiment
                    Experiment
                    • EyeTracker
                      EyeTracker
                      • None
                        None
                      • None
                        None
                      • None
                        None
                      • None
                        None
                      • 1000
                        1000
                      • None
                        None
                      • None
                        None
                    • 1000
                      1000
                    • Screen
                      Screen
                      • 68
                        68
                      • 30.2
                        30.2
                      • 1024
                        1024
                      • 'upper left'
                        'upper left'
                      • 38
                        38
                      • 1280
                        1280
                      • 15.599386487782953
                        15.599386487782953
                      • -15.599386487782953
                        -15.599386487782953
                      • 12.508044410882546
                        12.508044410882546
                      • -12.508044410882546
                        -12.508044410882546
                • (18 more)
              • PosixPath('data/ToyDataset')
                PosixPath('data/ToyDataset')
              • DatasetPaths
                DatasetPaths
                • PosixPath('data/ToyDataset')
                  PosixPath('data/ToyDataset')
                • PosixPath('data/ToyDataset/downloads')
                  PosixPath('data/ToyDataset/downloads')
                • PosixPath('data/ToyDataset/events')
                  PosixPath('data/ToyDataset/events')
                • PosixPath('data/ToyDataset/precomputed_events')
                  PosixPath('data/ToyDataset/precomputed_events')
                • PosixPath
                  PosixPath('data/ToyDataset/precomputed_reading_measures')
                • PosixPath('data/ToyDataset/preprocessed')
                  PosixPath('data/ToyDataset/preprocessed')
                • PosixPath('data/ToyDataset/raw')
                  PosixPath('data/ToyDataset/raw')
                • PosixPath('data/ToyDataset')
                  PosixPath('data/ToyDataset')
              • list (0 items)
                • list (0 items)

                  All of the preprocessed data is saved into this directory:

                  dataset.paths.preprocessed
                  
                  PosixPath('data/ToyDataset/preprocessed')
                  

                  Let’s confirm it by printing all the new files in this directory:

                  print(list(dataset.paths.preprocessed.glob('*/*/*')))
                  
                  [PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_1.feather')]
                  

                  All files have been saved as feather files in the directory given by Dataset.paths.preprocessed.
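The glob listing above comes back in arbitrary filesystem order. As a minimal sketch using only the standard library, the file names can be parsed back into their text_id and page_id and then sorted. The helper name parse_trial_ids is hypothetical and not part of pymovements, and the example paths are simply copied from the listing above.

```python
import re
from pathlib import Path

# Pattern mirroring the 'trial_<text_id>_<page_id>' naming seen in the listing.
TRIAL_PATTERN = re.compile(r'trial_(?P<text_id>\d+)_(?P<page_id>\d+)$')


def parse_trial_ids(filepath: Path) -> tuple[int, int]:
    """Return (text_id, page_id) parsed from a trial file name (hypothetical helper)."""
    match = TRIAL_PATTERN.match(filepath.stem)
    if match is None:
        raise ValueError(f'unexpected file name: {filepath.name}')
    return int(match['text_id']), int(match['page_id'])


# A few paths copied from the listing; in the tutorial you would iterate over
# dataset.paths.preprocessed.glob('*/*/*') instead.
root = Path('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data')
filepaths = [
    root / 'trial_3_5.feather',
    root / 'trial_0_1.feather',
    root / 'trial_1_2.feather',
]

# Sort numerically by (text_id, page_id) rather than by string order.
sorted_filepaths = sorted(filepaths, key=parse_trial_ids)
for filepath in sorted_filepaths:
    print(parse_trial_ids(filepath), filepath.name)
```

Sorting on the parsed integers rather than on the raw strings keeps the order stable even if an id ever reaches two digits.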

                  To save the data into an alternative directory and in a different file format, such as CSV, we can pass the directory name and the extension explicitly:

                  dataset.save_preprocessed(preprocessed_dirname='preprocessed_csv', extension='csv')
                  
                  Dataset
                  • DatasetDefinition
                    DatasetDefinition
                    • None
                      None
                    • dict (0 items)
                      • dict (1 items)
                        • dict (4 items)
                          • list (5 items)
                            • 'timestamp'
                            • 'x'
                            • (3 more)
                          • dict (5 items)
                            • Float64
                              Float64
                            • Float64
                              Float64
                            • (3 more)
                          • (2 more)
                      • None
                        None
                      • Experiment
                        Experiment
                        • EyeTracker
                          EyeTracker
                          • None
                            None
                          • None
                            None
                          • None
                            None
                          • None
                            None
                          • 1000
                            1000
                          • None
                            None
                          • None
                            None
                        • 1000
                          1000
                        • Screen
                          Screen
                          • 68
                            68
                          • 30.2
                            30.2
                          • 1024
                            1024
                          • 'upper left'
                            'upper left'
                          • 38
                            38
                          • 1280
                            1280
                          • 15.599386487782953
                            15.599386487782953
                          • -15.599386487782953
                            -15.599386487782953
                          • 12.508044410882546
                            12.508044410882546
                          • -12.508044410882546
                            -12.508044410882546
                      • None
                        None
                      • dict (1 items)
                        • 'trial_{text_id:d}_{page_id:d}.csv'
                          'trial_{text_id:d}_{page_id:d}.csv'
                      • dict (1 items)
                        • dict (2 items)
                          • <class 'int'>
                            <class 'int'>
                          • <class 'int'>
                            <class 'int'>
                      • True
                        True
                      • 'pymovements Toy Dataset'
                        'pymovements Toy Dataset'
                      • dict (0 items)
                        • 'ToyDataset'
                          'ToyDataset'
                        • list (2 items)
                          • 'x'
                          • 'y'
                        • None
                          None
                        • list (1 items)
                          • ResourceDefinition
                            • 'gaze'
                              'gaze'
                            • 'pymovements-toy-dataset.zip'
                              'pymovements-toy-dataset.zip'
                            • 'trial_{text_id:d}_{page_id:d}.csv'
                              'trial_{text_id:d}_{page_id:d}.csv'
                            • dict (2 items)
                              • <class 'int'>
                                <class 'int'>
                              • <class 'int'>
                                <class 'int'>
                            • None
                              None
                            • None
                              None
                            • '256901852c1c07581d375eef705855d6'
                              '256901852c1c07581d375eef705855d6'
                            • None
                              None
                            • str
                              'https://github.com/pymovements/pymovements-toy-dataset/archive/refs/heads/main.zip'
                        • 'timestamp'
                          'timestamp'
                        • 'ms'
                          'ms'
                        • None
                          None
                        • None
                          None
                      • tuple
                        (empty event DataFrames, each of shape (0, 6) with columns text_id (i64), page_id (i64), name (str), onset (i64), offset (i64), duration (i64))
                      • dict (1 items)
                        • DataFrame (5 columns, 20 rows)
                          shape: (20, 5)
                          text_idpage_idfilepathload_functionload_kwargs
                          i64i64strnullnull
                          01"pymovements-toy-dataset-main/d…nullnull
                          02"pymovements-toy-dataset-main/d…nullnull
                          03"pymovements-toy-dataset-main/d…nullnull
                          04"pymovements-toy-dataset-main/d…nullnull
                          05"pymovements-toy-dataset-main/d…nullnull
                          31"pymovements-toy-dataset-main/d…nullnull
                          32"pymovements-toy-dataset-main/d…nullnull
                          33"pymovements-toy-dataset-main/d…nullnull
                          34"pymovements-toy-dataset-main/d…nullnull
                          35"pymovements-toy-dataset-main/d…nullnull
                      • list (20 items)
                        • Gaze
                          • DataFrame (8 columns, 17223 rows)
                            shape: (17_223, 8)
                            timestimuli_xstimuli_ytext_idpage_idpixelpositionvelocity
                            i64f64f64i64i64list[f64]list[f64]list[f64]
                            1988145-1.0-1.001[206.8, 152.4][-10.697598, -8.852399][null, null]
                            1988146-1.0-1.001[206.9, 152.1][-10.695183, -8.859678][null, null]
                            1988147-1.0-1.001[207.0, 151.8][-10.692768, -8.866956][1.610194, -5.256267]
                            1988148-1.0-1.001[207.1, 151.7][-10.690352, -8.869381][0.402548, -4.447465]
                            1988149-1.0-1.001[207.0, 151.5][-10.692768, -8.874233][0.402561, -3.234462]
                            2005363-1.0-1.001[361.0, 415.4][-6.932438, -2.386672][-63.266374, -21.085616]
                            2005364-1.0-1.001[358.0, 414.5][-7.006376, -2.408998][-63.249652, -19.431326]
                            2005365-1.0-1.001[355.8, 413.8][-7.060582, -2.426362][-60.359624, -15.710061]
                            2005366-1.0-1.001[353.1, 413.2][-7.12709, -2.441245][null, null]
                            2005367-1.0-1.001[351.2, 412.9][-7.173881, -2.448686][null, null]
                          • Events
                            Events
                            • DataFrame (6 columns, 0 rows)
                              shape: (0, 6)
                              text_idpage_idnameonsetoffsetduration
                              i64i64stri64i64i64
                            • list (2 items)
                              • 'text_id'
                              • 'page_id'
                          • list (2 items)
                            • 'text_id'
                            • 'page_id'
                          • Experiment
                            Experiment
                            • EyeTracker
                              EyeTracker
                              • None
                                None
                              • None
                                None
                              • None
                                None
                              • None
                                None
                              • 1000
                                1000
                              • None
                                None
                              • None
                                None
                            • 1000
                              1000
                            • Screen
                              Screen
                              • 68
                                68
                              • 30.2
                                30.2
                              • 1024
                                1024
                              • 'upper left'
                                'upper left'
                              • 38
                                38
                              • 1280
                                1280
                              • 15.599386487782953
                                15.599386487782953
                              • -15.599386487782953
                                -15.599386487782953
                              • 12.508044410882546
                                12.508044410882546
                              • -12.508044410882546
                                -12.508044410882546
                        • Gaze
                          • DataFrame (8 columns, 29799 rows)
                            shape: (29_799, 8)
                            timestimuli_xstimuli_ytext_idpage_idpixelpositionvelocity
                            i64f64f64i64i64list[f64]list[f64]list[f64]
                            2008305-1.0-1.002[141.4, 153.6][-12.268583, -8.823284][null, null]
                            2008306-1.0-1.002[141.1, 153.2][-12.275749, -8.832989][null, null]
                            2008307-1.0-1.002[140.7, 152.8][-12.285302, -8.842695][-5.572617, -6.065816]
                            2008308-1.0-1.002[140.6, 152.7][-12.28769, -8.845121][-3.582268, -4.043733]
                            2008309-1.0-1.002[140.5, 152.6][-12.290078, -8.847547][-2.388085, -2.021821]
                            2038099-1.0-1.002[273.8, 773.8][-9.071149, 6.490168][1.21962, 1.635403]
                            2038100-1.0-1.002[273.8, 774.1][-9.071149, 6.497527][1.626175, 4.497406]
                            2038101-1.0-1.002[273.9, 774.5][-9.06871, 6.50734][1.626186, 1.635423]
                            2038102-1.0-1.002[274.0, 774.4][-9.066271, 6.504886][null, null]
                            2038103-1.0-1.002[274.0, 773.9][-9.066271, 6.492621][null, null]
                          • Events
                            Events
                            • DataFrame (6 columns, 0 rows)
                              shape: (0, 6)
                              text_idpage_idnameonsetoffsetduration
                              i64i64stri64i64i64
                            • list (2 items)
                              • 'text_id'
                              • 'page_id'
                          • list (2 items)
                            • 'text_id'
                            • 'page_id'
                          • Experiment
                            Experiment
                            • EyeTracker
                              EyeTracker
                              • None
                                None
                              • None
                                None
                              • None
                                None
                              • None
                                None
                              • 1000
                                1000
                              • None
                                None
                              • None
                                None
                            • 1000
                              1000
                            • Screen
                              Screen
                              • 68
                                68
                              • 30.2
                                30.2
                              • 1024
                                1024
                              • 'upper left'
                                'upper left'
                              • 38
                                38
                              • 1280
                                1280
                              • 15.599386487782953
                                15.599386487782953
                              • -15.599386487782953
                                -15.599386487782953
                              • 12.508044410882546
                                12.508044410882546
                              • -12.508044410882546
                                -12.508044410882546
                        • (18 more)
                      • PosixPath('data/ToyDataset')
                        PosixPath('data/ToyDataset')
                      • DatasetPaths
                        DatasetPaths
                        • PosixPath('data/ToyDataset')
                          PosixPath('data/ToyDataset')
                        • PosixPath('data/ToyDataset/downloads')
                          PosixPath('data/ToyDataset/downloads')
                        • PosixPath('data/ToyDataset/events')
                          PosixPath('data/ToyDataset/events')
                        • PosixPath('data/ToyDataset/precomputed_events')
                          PosixPath('data/ToyDataset/precomputed_events')
                        • PosixPath
                          PosixPath('data/ToyDataset/precomputed_reading_measures')
                        • PosixPath('data/ToyDataset/preprocessed')
                          PosixPath('data/ToyDataset/preprocessed')
                        • PosixPath('data/ToyDataset/raw')
                          PosixPath('data/ToyDataset/raw')
                        • PosixPath('data/ToyDataset')
                          PosixPath('data/ToyDataset')
                      • list (0 items)
                        • list (0 items)

                          Let’s confirm again by printing all the new files in this alternative directory:

                          alternative_dirpath = dataset.path / 'preprocessed_csv'
                          print(list(alternative_dirpath.glob('*/*/*')))
                          
                          [PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_5.csv')]
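To check that the CSV export covers the same trials as the feather export, one option is to compare the file stems found in both directories. The following is a minimal sketch using only the standard library. The helper trial_stems is hypothetical and not part of pymovements, and the demo builds temporary directories with empty placeholder files instead of touching the real dataset.

```python
import tempfile
from pathlib import Path


def trial_stems(dirpath: Path, extension: str) -> set[str]:
    """Collect the stems of all files with the given extension below dirpath (hypothetical helper)."""
    return {filepath.stem for filepath in dirpath.rglob(f'*.{extension}')}


# Stand-in directories with empty placeholder files; with the real dataset you
# would pass dataset.paths.preprocessed and dataset.path / 'preprocessed_csv'.
with tempfile.TemporaryDirectory() as tmp:
    feather_dir = Path(tmp) / 'preprocessed'
    csv_dir = Path(tmp) / 'preprocessed_csv'
    for dirpath, extension in [(feather_dir, 'feather'), (csv_dir, 'csv')]:
        data_dir = dirpath / 'pymovements-toy-dataset-main' / 'data'
        data_dir.mkdir(parents=True)
        for name in ['trial_0_1', 'trial_0_2']:
            (data_dir / f'{name}.{extension}').touch()

    feather_stems = trial_stems(feather_dir, 'feather')
    csv_stems = trial_stems(csv_dir, 'csv')

# Trials present as feather files but absent from the CSV export.
missing_in_csv = feather_stems - csv_stems
print(sorted(missing_in_csv))
```

An empty result means every feather file has a CSV counterpart with the same trial name.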
                          

                          Loading#

                          Now let’s imagine that the preprocessing and saving were done in another file and that we only want to load the preprocessed data.

                          We simulate this by initializing a new dataset object. We don’t need to download any additional data.

                          events_dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
                          

                          The preprocessed data can now simply be loaded by setting preprocessed to True:

                          events_dataset.load(preprocessed=True)
                          
                          events_dataset.gaze[0]
                          
                          Gaze
                          • DataFrame (8 columns, 17223 rows)
                            shape: (17_223, 8)
                            timestimuli_xstimuli_ypixelpositionvelocitytext_idpage_id
                            i64f64f64list[f64]list[f64]list[f64]i64i64
                            1988145-1.0-1.0[206.8, 152.4][-10.697598, -8.852399][null, null]01
                            1988146-1.0-1.0[206.9, 152.1][-10.695183, -8.859678][null, null]01
                            1988147-1.0-1.0[207.0, 151.8][-10.692768, -8.866956][1.610194, -5.256267]01
                            1988148-1.0-1.0[207.1, 151.7][-10.690352, -8.869381][0.402548, -4.447465]01
                            1988149-1.0-1.0[207.0, 151.5][-10.692768, -8.874233][0.402561, -3.234462]01
                            2005363-1.0-1.0[361.0, 415.4][-6.932438, -2.386672][-63.266374, -21.085616]01
                            2005364-1.0-1.0[358.0, 414.5][-7.006376, -2.408998][-63.249652, -19.431326]01
                            2005365-1.0-1.0[355.8, 413.8][-7.060582, -2.426362][-60.359624, -15.710061]01
                            2005366-1.0-1.0[353.1, 413.2][-7.12709, -2.441245][null, null]01
                            2005367-1.0-1.0[351.2, 412.9][-7.173881, -2.448686][null, null]01
                          • Events
                            • DataFrame (6 columns, 0 rows)
                              shape: (0, 6)
                              text_id  page_id  name  onset  offset  duration
                              i64      i64      str   i64    i64     i64
                            • list (2 items)
                              • 'text_id'
                              • 'page_id'
                          • list (2 items)
                            • 'text_id'
                            • 'page_id'
                          • Experiment
                            • EyeTracker
                              • None
                              • None
                              • None
                              • None
                              • 1000
                              • None
                              • None
                            • 1000
                            • Screen
                              • 68
                              • 30.2
                              • 1024
                              • 'upper left'
                              • 38
                              • 1280
                              • 15.599386487782953
                              • -15.599386487782953
                              • 12.508044410882546
                              • -12.508044410882546

                          By default, load() reads from the preprocessed directory and expects files with the feather extension.
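To check which files a preprocessed load would find, the dataset directory can be inspected with the standard library alone. This is a minimal sketch: the helper name list_preprocessed_files is hypothetical and not part of pymovements, and the layout (a preprocessed subdirectory directly under the dataset path, possibly with nested folders) is an assumption based on the defaults described above.

```python
from pathlib import Path


def list_preprocessed_files(dataset_path, dirname='preprocessed', extension='feather'):
    """Return the sorted files with the given extension below dataset_path/dirname.

    Hypothetical helper for illustration; not part of pymovements.
    """
    root = Path(dataset_path) / dirname
    if not root.is_dir():
        # Nothing has been saved under this directory name yet.
        return []
    # rglob also finds files in nested subdirectories.
    return sorted(root.rglob(f'*.{extension}'))


# Assumed dataset path, taken from the Dataset initialization in this tutorial.
for path in list_preprocessed_files('data/ToyDataset'):
    print(path)
```

An empty result suggests that the data was saved under another directory name or in another format, in which case the matching arguments have to be passed when loading.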

                          If the data was saved under a different directory name or in another file format, pass preprocessed_dirname and extension explicitly:

                          events_dataset.load(
                              preprocessed=True,
                              preprocessed_dirname='preprocessed_csv',
                              extension='csv',
                          )
                          events_dataset.gaze[0]
                          
                          Gaze
                          • DataFrame (8 columns, 17223 rows)
                            shape: (17_223, 8)
                            time     stimuli_x  stimuli_y  text_id  page_id  pixel           position                 velocity
                            i64      f64        f64        i64      i64      list[f64]       list[f64]                list[f64]
                            1988145  -1.0       -1.0       0        1        [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
                            1988146  -1.0       -1.0       0        1        [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
                            1988147  -1.0       -1.0       0        1        [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
                            1988148  -1.0       -1.0       0        1        [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
                            1988149  -1.0       -1.0       0        1        [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
                            …
                            2005363  -1.0       -1.0       0        1        [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
                            2005364  -1.0       -1.0       0        1        [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
                            2005365  -1.0       -1.0       0        1        [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
                            2005366  -1.0       -1.0       0        1        [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
                            2005367  -1.0       -1.0       0        1        [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
                          • Events
                            • DataFrame (6 columns, 0 rows)
                              shape: (0, 6)
                              text_id  page_id  name  onset  offset  duration
                              i64      i64      str   i64    i64     i64
                            • list (2 items)
                              • 'text_id'
                              • 'page_id'
                          • list (2 items)
                            • 'text_id'
                            • 'page_id'
                          • None
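The feather and CSV outputs above list the same eight columns, but in a different order: after the CSV load, text_id and page_id come before pixel, position and velocity. Code that addresses columns by position would therefore behave differently depending on the file format. A minimal sketch of an order-independent comparison follows; the column names are copied from the two outputs, and the helper same_columns is hypothetical.

```python
# Column order as displayed after loading the feather files.
feather_columns = [
    'time', 'stimuli_x', 'stimuli_y', 'pixel',
    'position', 'velocity', 'text_id', 'page_id',
]
# Column order as displayed after loading the CSV files.
csv_columns = [
    'time', 'stimuli_x', 'stimuli_y', 'text_id',
    'page_id', 'pixel', 'position', 'velocity',
]


def same_columns(left, right):
    """Return True if both sequences hold the same column names, ignoring order."""
    return sorted(left) == sorted(right)


print(same_columns(feather_columns, csv_columns))  # True
print(feather_columns == csv_columns)              # False
```

Selecting columns by name rather than by index avoids depending on the order in which a particular format stores them.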

                          What you have learned in this tutorial:#