CSV Files#

CSV files are flexible but require explicit column definitions so that pymovements knows how to interpret the data. Below is a toy example of eye-tracking samples stored in a CSV file.

import polars as pl

import pymovements as pm
from pymovements.gaze.experiment import Experiment

csv_example = pl.read_csv(
    "../../examples/gaze-toy-example.csv",
)
csv_example.head(5)
shape: (5, 3)
time   x      y
f64    f64    f64
0.0    206.8  152.4
4.0    207.0  151.5
8.0    207.6  151.9
12.0   207.6  152.2
16.0   207.8  151.6

Time Information#

When loading gaze data with from_csv(), the column containing timestamps must be specified via the time_column parameter if it is not already named time. This column will be internally standardized to time within the resulting Gaze object to ensure a consistent temporal representation across datasets.

If time_unit is not specified (None), timestamps are assumed to be in milliseconds. Supported units are:

  • 'ms' — milliseconds (default)

  • 's' — seconds

  • 'step' — sampling steps

If time_unit='step', an Experiment definition with a sampling rate must be provided so that timestamps can be converted to milliseconds. If no time_column is provided, the data are assumed not to contain explicit timestamps. In that case, a time axis can be generated later based on the sampling rate defined in the experiment (see Experiment Configuration).
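The unit handling described above can be sketched in plain Python. This is only an illustration of the arithmetic, not pymovements' implementation; the helper names `to_milliseconds` and `generate_time_axis` are hypothetical and do not exist in the library.

```python
def to_milliseconds(timestamps, time_unit=None, sampling_rate=None):
    """Convert timestamps to milliseconds (hypothetical helper, not a pymovements API)."""
    if time_unit is None or time_unit == "ms":
        # Unspecified units are treated as milliseconds.
        return list(timestamps)
    if time_unit == "s":
        return [t * 1000 for t in timestamps]
    if time_unit == "step":
        if sampling_rate is None:
            raise ValueError("time_unit='step' requires a sampling rate")
        # One step lasts 1000 / sampling_rate milliseconds.
        return [t * 1000 / sampling_rate for t in timestamps]
    raise ValueError(f"unsupported time unit: {time_unit!r}")


def generate_time_axis(n_samples, sampling_rate):
    """Build a millisecond time axis for data without explicit timestamps."""
    return [i * 1000 / sampling_rate for i in range(n_samples)]


steps_in_ms = to_milliseconds([0, 1, 2, 3], time_unit="step", sampling_rate=250.0)
seconds_in_ms = to_milliseconds([0.0, 0.5], time_unit="s")
time_axis = generate_time_axis(3, sampling_rate=250.0)
print(steps_in_ms)  # [0.0, 4.0, 8.0, 12.0]
```

At 250 Hz each step corresponds to 4 ms, which matches the spacing of the timestamps in the toy example.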

Defining Gaze Components#

Gaze signals are typically stored in separate columns of the CSV file (e.g., x_left, y_left).
The following component parameters specify how these flat columns should be grouped into structured gaze components inside the Gaze object:

Parameter             Expects                                  Creates nested column  Unit
pixel_columns         List of pixel coordinate columns         pixel                  pixels (px)
position_columns      List of dva coordinate columns           position               dva (°)
velocity_columns      List of velocity component columns       velocity               dva/s or px/s
acceleration_columns  List of acceleration component columns   acceleration           dva/s² or px/s²
distance_column       Single column name                       distance               cm

If a non-empty list is passed to one of the component parameters, the specified columns are merged into a single nested list column in samples.

The supported numbers of component columns, together with their expected order, are:

  • 0 columns → no nested column created

  • 2 columns → monocular (x, y)

  • 4 columns → binocular (x_left, y_left, x_right, y_right)

  • 6 columns → binocular + cyclopean coordinates
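The merging rule can be illustrated with a small pure-Python sketch. It mimics the idea of combining flat columns into one nested list per sample; `merge_components` is a hypothetical helper written for this illustration, not the code pymovements runs internally.

```python
SUPPORTED_COLUMN_COUNTS = (0, 2, 4, 6)


def merge_components(rows, columns):
    """Merge the given flat columns of each row into one list per row.

    Hypothetical helper for illustration only. Returns None when no columns
    are given, mirroring the "no nested column created" case.
    """
    if len(columns) not in SUPPORTED_COLUMN_COUNTS:
        raise ValueError(
            f"expected 0, 2, 4 or 6 component columns, got {len(columns)}"
        )
    if not columns:
        return None
    # The order of `columns` determines the order inside each nested list.
    return [[row[name] for name in columns] for row in rows]


# First two samples of the toy example above.
rows = [
    {"time": 0.0, "x": 206.8, "y": 152.4},
    {"time": 4.0, "x": 207.0, "y": 151.5},
]

pixel = merge_components(rows, ["x", "y"])
print(pixel)  # [[206.8, 152.4], [207.0, 151.5]]
```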

Pixel vs. Position Coordinates#

You typically provide either:

  • pixel_columns — if your data are in screen pixels (common for raw exports)

  • position_columns — if your data are already converted to degrees of visual angle (dva)

If both are provided, pymovements keeps both representations, allowing you to switch between coordinate systems without recomputing. Conversions between pixel and dva coordinates require a valid Experiment with screen geometry and viewing distance.
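To see why screen geometry and viewing distance are needed, here is a minimal sketch of a one-dimensional pixel-to-dva conversion. It assumes an upper-left origin, a flat screen centered in front of the viewer, and pixel coordinates referring to pixel centers; `pixel_to_dva` is a hypothetical helper, and pymovements' own conversion may differ in detail.

```python
import math


def pixel_to_dva(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (illustrative only)."""
    # Shift the origin from the upper-left corner to the screen center.
    centered_px = pixel - (screen_px - 1) / 2
    # Convert the offset from pixels to centimeters on the screen surface.
    centered_cm = centered_px * screen_cm / screen_px
    # The visual angle is the angle between the line of sight to the screen
    # center and the line of sight to the gaze point.
    return math.degrees(math.atan2(centered_cm, distance_cm))


# Horizontal screen geometry taken from the experiment used on this page.
center = pixel_to_dva(639.5, screen_px=1280, screen_cm=38, distance_cm=68)
left_edge = pixel_to_dva(0, screen_px=1280, screen_cm=38, distance_cm=68)
right_edge = pixel_to_dva(1279, screen_px=1280, screen_cm=38, distance_cm=68)
print(center, left_edge, right_edge)
```

The screen center maps to 0 dva, and the two edges map to angles of equal magnitude and opposite sign.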

Using an Experiment#

Providing an Experiment connects gaze samples to screen geometry and sampling rate. This enables:

  • Pixel–dva transformations

  • Velocity and acceleration computation in physical units

  • Time-step conversion when time_unit="step"

If no experiment is provided, gaze data can still be loaded, but certain transformations will not be available.
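As a rough illustration of why the sampling rate matters for velocity, the sketch below uses a plain forward difference between consecutive samples. This is only one simple scheme chosen for clarity and is not necessarily the method pymovements applies; `forward_difference_velocity` is a hypothetical helper.

```python
def forward_difference_velocity(positions, sampling_rate):
    """Approximate velocity in units per second between consecutive samples."""
    # Dividing by the sample interval (1 / sampling_rate seconds) is the same
    # as multiplying by the sampling rate.
    return [
        (current - previous) * sampling_rate
        for previous, current in zip(positions, positions[1:])
    ]


positions_dva = [0.0, 0.25, 0.75]  # illustrative horizontal positions in dva
velocities = forward_difference_velocity(positions_dva, sampling_rate=250.0)
print(velocities)  # [62.5, 125.0]
```

Without a sampling rate, the differences could only be expressed per sample rather than per second.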

Automatic Column Detection#

pymovements can detect columns automatically, but this feature is still under development. The current naming scheme is:

  • column name prefixes define the type of data (e.g., pixel, position)

  • column name suffixes define the component (e.g., x, y, xr, yl)

This means that only column names such as pixel_x, position_xr or acceleration_xa can be inferred. If this scheme fits your setup, you can set auto_column_detect=True.
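The naming scheme can be made concrete with a small sketch that groups column names by prefix. The prefix list and the suffix pattern (x or y, optionally followed by l, r or a) are assumptions inferred from the examples above, and `detect_columns` is a hypothetical helper rather than the library's detection code.

```python
import re

# Assumed prefixes and suffix pattern, inferred from the examples on this page.
ASSUMED_PREFIXES = ("pixel", "position", "velocity", "acceleration")
COLUMN_PATTERN = re.compile(
    r"^(?P<kind>" + "|".join(ASSUMED_PREFIXES) + r")_(?P<component>[xy][lra]?)$"
)


def detect_columns(column_names):
    """Group column names that follow the prefix_suffix scheme by their prefix."""
    detected = {}
    for name in column_names:
        match = COLUMN_PATTERN.match(name)
        if match:
            detected.setdefault(match["kind"], []).append(name)
    return detected


detected = detect_columns(
    ["time", "pixel_x", "pixel_y", "position_xr", "acceleration_xa", "x_left"]
)
print(detected)
# {'pixel': ['pixel_x', 'pixel_y'], 'position': ['position_xr'],
#  'acceleration': ['acceleration_xa']}
```

Names such as time or x_left do not match the scheme and are left out, which is why such files need explicit column parameters.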

from_csv()#

Putting all of this together, we can load the toy example from above directly into a Gaze object:

experiment = Experiment(
    screen_width_px=1280,
    screen_height_px=1024,
    screen_width_cm=38,
    screen_height_cm=30.2,
    distance_cm=68,
    origin='upper left',
    sampling_rate=250.0,
)

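gaze = pm.gaze.from_csv(
    '../../examples/gaze-toy-example.csv',
    experiment=experiment,
    time_column='time',        # timestamp column in the CSV file
    time_unit='ms',            # timestamps are given in milliseconds
    pixel_columns=['x', 'y'],  # merged into the nested pixel column
)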

gaze.samples.head(5)
shape: (5, 2)
time  pixel
i64   list[f64]
0     [206.8, 152.4]
4     [207.0, 151.5]
8     [207.6, 151.9]
12    [207.6, 152.2]
16    [207.8, 151.6]