Saving and Loading Preprocessed Data
What you will learn in this tutorial:
how to save your preprocessed data
how to load your preprocessed data
Preparations
We import pymovements under the alias pm for convenience.
import pymovements as pm
Let’s start by downloading our ToyDataset and loading in its data:
dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
dataset.download()
dataset.load()
INFO:pymovements.dataset.dataset:
You are downloading the pymovements Toy Dataset. Please be aware that pymovements does not
host or distribute any dataset resources and only provides a convenient interface to
download the public dataset resources that were published by their respective authors.
Please cite the referenced publication if you intend to use the dataset in your research.
Using already downloaded and verified file: data/ToyDataset/downloads/pymovements-toy-dataset.zip
Extracting pymovements-toy-dataset.zip to data/ToyDataset/raw
Extracting archive: 0%| | 0/23 [00:00<?, ?file/s]
Extracting archive: 100%|██████████| 23/23 [00:00<00:00, 261.69file/s]
Output (Dataset representation, abridged):
Dataset definition: name 'ToyDataset', long_name 'pymovements Toy Dataset'
Description: 'Example toy dataset. This dataset includes monocular eye tracking data from a single participant in a single session. Eye movements are recorded at a sampling frequency of 1000 Hz using an EyeLink Portable Duo video-based eye tracker and are provided as pixel coordinates. The participant is instructed to read 4 texts with 5 screens each.'
Eye tracker: sampling_rate 1000; left, right, model, mount, vendor, version are None
Screen: width_px 1280, height_px 1024, width_cm 38, height_cm 30.2, distance_cm 68, origin 'upper left', x_min_dva -15.599386487782953, x_max_dva 15.599386487782953, y_min_dva -12.508044410882546, y_max_dva 12.508044410882546
Resources: 1 ResourceDefinition with content 'gaze', filename 'pymovements-toy-dataset.zip', filename_pattern 'trial_{text_id:d}_{page_id:d}.csv' (text_id and page_id parsed as int), time_column 'timestamp', time_unit 'ms', md5 '256901852c1c07581d375eef705855d6', url 'https://github.com/pymovements/pymovements-toy-dataset/archive/refs/heads/main.zip'
File info: DataFrame with 20 rows and columns text_id (i64), page_id (i64), filepath (str); text_id runs from 0 to 3 and page_id from 1 to 5
Gaze: list of 20 items; the first two hold 17_223 and 29_799 rows, each with columns time (i64), stimuli_x (f64), stimuli_y (f64), pixel (list[f64])
Events: 20 empty DataFrames with columns name (str), onset (i64), offset (i64), duration (i64)
Paths: root data/ToyDataset with subdirectories downloads, raw, preprocessed, events, precomputed_events, precomputed_reading_measures, stimuli
Precomputed events, precomputed reading measures, stimuli: empty lists
Now let’s do some preprocessing. We convert the pixel coordinates to degrees of visual angle and then compute velocities from the resulting positions:
dataset.pix2deg()
dataset.pos2vel()
dataset.gaze[0]
DataFrame (6 columns, 17223 rows)
shape: (17_223, 6)
time     stimuli_x  stimuli_y  pixel           position                 velocity
i64      f64        f64        list[f64]       list[f64]                list[f64]
1988145  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
1988146  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
1988147  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
1988148  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
1988149  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
…        …          …          …               …                        …
2005363  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
2005364  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
2005365  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
2005366  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
2005367  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
Events: DataFrame (4 columns, 0 rows) with columns name (str), onset (i64), offset (i64), duration (i64)
trial_columns: None
Experiment: sampling_rate 1000; screen width_px 1280, height_px 1024, width_cm 38, height_cm 30.2, distance_cm 68, origin 'upper left'
This adds two columns to each gaze frame: position, which holds the gaze position in degrees of visual angle, and velocity.
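The values in the position and velocity columns can be reproduced by hand. The sketch below is a re-derivation under stated assumptions, not pymovements' own code: it assumes that pixel coordinates are centered on the screen midpoint, scaled to centimeters, and converted with an arctangent against the viewing distance, and that velocity is a five-point central difference (which would explain the two null rows at each end of the frame). The helper names pix2deg_1d and fivepoint_velocity are hypothetical; the screen constants and samples are copied from the output above.

```python
import math

# Screen geometry taken from the experiment definition shown above.
WIDTH_PX, WIDTH_CM = 1280, 38.0
HEIGHT_PX, HEIGHT_CM = 1024, 30.2
DISTANCE_CM = 68.0
SAMPLING_RATE = 1000  # Hz


def pix2deg_1d(pixel, size_px, size_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (hypothetical helper)."""
    centered_px = pixel - (size_px - 1) / 2        # move the origin to the screen center
    centered_cm = centered_px * size_cm / size_px  # pixels to centimeters
    return math.degrees(math.atan2(centered_cm, distance_cm))


def fivepoint_velocity(values, sampling_rate):
    """Five-point central difference; the two samples at each end stay None."""
    velocities = [None] * len(values)
    for i in range(2, len(values) - 2):
        diff = values[i + 2] + values[i + 1] - values[i - 1] - values[i - 2]
        velocities[i] = diff * sampling_rate / 6
    return velocities


# First sample of the first gaze frame: pixel = [206.8, 152.4].
x_deg = pix2deg_1d(206.8, WIDTH_PX, WIDTH_CM, DISTANCE_CM)
y_deg = pix2deg_1d(152.4, HEIGHT_PX, HEIGHT_CM, DISTANCE_CM)

# Horizontal positions of the first five samples, copied from the table.
x_positions = [-10.697598, -10.695183, -10.692768, -10.690352, -10.692768]
x_velocities = fivepoint_velocity(x_positions, SAMPLING_RATE)

print(round(x_deg, 3), round(y_deg, 3))
print(x_velocities)
```

Under these assumptions the first sample comes out close to the tabulated position [-10.697598, -8.852399], and the only velocity that five samples allow (the third) comes out close to the tabulated 1.610194. Small deviations are expected because the positions in the table are rounded.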
Saving
Saving your preprocessed data is as simple as:
dataset.save_preprocessed()
Output: the Dataset representation (definition, experiment, resources, file info, and paths unchanged). Each of the 20 gaze frames now has 6 columns: time, stimuli_x, stimuli_y, pixel, position, velocity. The first two frames hold 17_223 and 29_799 rows.
All of the preprocessed data is saved into this directory:
dataset.paths.preprocessed
PosixPath('data/ToyDataset/preprocessed')
Let’s confirm it by printing all the new files in this directory:
print(list(dataset.paths.preprocessed.glob('*/*/*')))
[PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_4.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_1.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_0_2.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_2_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_5.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_3_3.feather'), PosixPath('data/ToyDataset/preprocessed/pymovements-toy-dataset-main/data/trial_1_1.feather')]
All files have been saved as feather files in the directory given by Dataset.paths.preprocessed.
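Judging from the listing above, each saved file appears to keep its path relative to the raw directory and only swaps the file extension. The sketch below illustrates that naming convention with the standard library alone. The helper preprocessed_filepath is hypothetical and not part of pymovements, and the raw relative path is an assumption pieced together from the filename pattern and the printed paths.

```python
from pathlib import PurePosixPath


def preprocessed_filepath(root, relative_raw_path, dirname="preprocessed", extension="feather"):
    """Mirror a raw file's relative path under another directory with a new suffix.

    Hypothetical helper for illustration; not part of pymovements.
    """
    relative = PurePosixPath(relative_raw_path).with_suffix("." + extension)
    return PurePosixPath(root) / dirname / relative


# Assumed relative path of the first trial inside the raw directory.
raw_relative = "pymovements-toy-dataset-main/data/trial_0_1.csv"
feather_path = preprocessed_filepath("data/ToyDataset", raw_relative)
print(feather_path)
```

The dirname and extension parameters of this sketch only show how a different target directory or file format would change the resulting path.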
To save the data into an alternative directory and in a different file format such as csv, we can use the following:
dataset.save_preprocessed(preprocessed_dirname='preprocessed_csv', extension='csv')
Output: the Dataset representation, identical to the one returned by the previous save_preprocessed() call.
Let’s confirm again by printing all the new files in this alternative directory:
alternative_dirpath = dataset.path / 'preprocessed_csv'
print(list(alternative_dirpath.glob('*/*/*')))
[PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_3.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_4.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_3_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_5.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_0_1.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_1_2.csv'), PosixPath('data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data/trial_2_2.csv')]
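Both listings come back in arbitrary order, since glob makes no ordering guarantee. If a stable order is needed, the trial identifiers can be parsed from the filenames and used as a sort key. The sketch below is one way to do this with the standard library; the regular expression mirrors the dataset's filename_pattern, and the helper trial_key is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Mirrors the dataset's filename_pattern 'trial_{text_id:d}_{page_id:d}.csv'.
TRIAL_REGEX = re.compile(r"trial_(?P<text_id>\d+)_(?P<page_id>\d+)\.csv")


def trial_key(path):
    """Return (text_id, page_id) parsed from a trial filename (hypothetical helper)."""
    match = TRIAL_REGEX.fullmatch(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"unexpected filename: {path}")
    return int(match["text_id"]), int(match["page_id"])


# A few of the paths printed above, in the order glob happened to return them.
base = "data/ToyDataset/preprocessed_csv/pymovements-toy-dataset-main/data"
paths = [f"{base}/trial_3_1.csv", f"{base}/trial_2_3.csv", f"{base}/trial_0_4.csv"]

# Sort numerically by text_id first, then by page_id.
sorted_paths = sorted(paths, key=trial_key)
print([trial_key(p) for p in sorted_paths])
```

Sorting on the parsed integers rather than on the path strings keeps the order numeric even if an identifier ever reaches two digits.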
Loading
Now let’s imagine that this preprocessing and saving was done in another file, and we only want to load the preprocessed data.
We simulate this by initializing a new dataset object. We don’t need to download any additional data.
events_dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')
The preprocessed data can now simply be loaded by setting preprocessed to True:
events_dataset.load(preprocessed=True)
events_dataset.gaze[0]
DataFrame (6 columns, 17223 rows)
shape: (17_223, 6)
time     stimuli_x  stimuli_y  pixel           position                 velocity
i64      f64        f64        list[f64]       list[f64]                list[f64]
1988145  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
1988146  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
1988147  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
1988148  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
1988149  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
…        …          …          …               …                        …
2005363  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
2005364  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
2005365  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
2005366  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
2005367  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
Events: DataFrame (4 columns, 0 rows) with columns name (str), onset (i64), offset (i64), duration (i64)
trial_columns: None
Experiment: sampling_rate 1000; screen width_px 1280, height_px 1024, width_cm 38, height_cm 30.2, distance_cm 68, origin 'upper left'
By default, the data is loaded from the preprocessed directory and the feather extension is assumed.
In the case of alternative directory names or other file formats, you can use the following:
events_dataset.load(
preprocessed=True,
preprocessed_dirname='preprocessed_csv',
extension='csv',
)
events_dataset.gaze[0]
DataFrame (6 columns, 17223 rows)
shape: (17_223, 6)
time     stimuli_x  stimuli_y  pixel           position                 velocity
i64      f64        f64        list[f64]       list[f64]                list[f64]
1988145  -1.0       -1.0       [206.8, 152.4]  [-10.697598, -8.852399]  [null, null]
1988146  -1.0       -1.0       [206.9, 152.1]  [-10.695183, -8.859678]  [null, null]
1988147  -1.0       -1.0       [207.0, 151.8]  [-10.692768, -8.866956]  [1.610194, -5.256267]
1988148  -1.0       -1.0       [207.1, 151.7]  [-10.690352, -8.869381]  [0.402548, -4.447465]
1988149  -1.0       -1.0       [207.0, 151.5]  [-10.692768, -8.874233]  [0.402561, -3.234462]
…        …          …          …               …                        …
2005363  -1.0       -1.0       [361.0, 415.4]  [-6.932438, -2.386672]   [-63.266374, -21.085616]
2005364  -1.0       -1.0       [358.0, 414.5]  [-7.006376, -2.408998]   [-63.249652, -19.431326]
2005365  -1.0       -1.0       [355.8, 413.8]  [-7.060582, -2.426362]   [-60.359624, -15.710061]
2005366  -1.0       -1.0       [353.1, 413.2]  [-7.12709, -2.441245]    [null, null]
2005367  -1.0       -1.0       [351.2, 412.9]  [-7.173881, -2.448686]   [null, null]
Events: DataFrame (4 columns, 0 rows) with columns name (str), onset (i64), offset (i64), duration (i64)
trial_columns: None
experiment: None
What you have learned in this tutorial:
saving your preprocessed data using Dataset.save_preprocessed()
loading your preprocessed data using Dataset.load(preprocessed=True)
using custom directory names by specifying preprocessed_dirname
using file formats other than the default feather format by specifying extension