Understanding Eye-Tracking Data

Before preprocessing, event detection, or statistical analysis, it is important to understand what eye-tracking data look like at their most basic level and how they are structured. This section introduces the core components of eye-tracking recordings and the representations commonly used in analysis.

What are Eye-Tracking Data?

Eye-tracking data consist of measurements of eye position over time, typically recorded at a fixed sampling frequency. Depending on the experimental setup, these measurements are collected while participants engage with stimuli, i.e., the content presented to them during the experiment. For example, participants may be:

reading texts,
viewing static images,
watching videos,
or interacting with dynamic or real-world environments.
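
To make the notion of a fixed sampling frequency concrete, the following minimal sketch estimates the sampling rate from a handful of timestamps. The timestamp values, the assumption that they are in milliseconds, and the helper `estimate_sampling_rate` are illustrative only and not part of any library.

```python
from statistics import median


def estimate_sampling_rate(timestamps_ms):
    """Estimate the sampling rate in Hz from timestamps given in milliseconds."""
    # Intervals between consecutive samples.
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    # The median is robust against an occasional dropped sample.
    return 1000.0 / median(intervals)


# Hypothetical timestamps, assumed to be in milliseconds.
timestamps_ms = [0, 4, 8, 12, 16, 20]
rate_hz = estimate_sampling_rate(timestamps_ms)
print(rate_hz)
```

A 4 ms interval between samples corresponds to a 250 Hz recording. Using the median rather than the mean is a design choice that keeps the estimate stable when a few samples are missing.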

A device called an eye tracker estimates the point of gaze, that is, where on a stimulus or display a participant is inferred to be looking, by measuring the relative position of the pupil and corneal reflections. During calibration, participants fixate known reference points, allowing the system to learn a mapping from eye position signals to gaze coordinates on the stimulus. In screen-based experiments, gaze coordinates are commonly expressed in pixel units, corresponding to positions on the display surface.

The files produced by an eye tracker contain mixed content. They typically include time-stamped eye position coordinates (raw samples); labels marking where gaze moves quickly or, conversely, slows down and rests at a position (oculomotor events); and additional eye tracker output, including experiment metadata and vendor-provided labels (messages). These files are often referred to as “raw eye-tracking data”. However, this term is used inconsistently in the literature and is sometimes confused with “raw gaze data” or “raw samples”, so the same label can refer to different things depending on the source.

In pymovements, “raw samples” refers specifically to the lowest-level gaze time series available after import, before any further processing. Events, whether vendor-provided or detected through processing, are stored separately. pymovements aggregates these and other data components into a single comprehensive Gaze object, presented below.

The `Gaze` Object

All loading functions in pymovements return a Gaze object. This is the central data structure used throughout the library and serves as a self-contained object for eye-tracking data and its metadata. A Gaze object bundles together multiple components of a recording, including samples, events, experiment data and more.

Gaze

samples: DataFrame (2 columns, 4306 rows)

shape: (4_306, 2)
time     pixel
i64      list[f64]
0        [206.8, 152.4]
4        [207.0, 151.5]
8        [207.6, 151.9]
12       [207.6, 152.2]
16       [207.8, 151.6]
…        …
17204    [349.3, 420.0]
17208    [362.7, 418.1]
17212    [371.2, 419.0]
17216    [365.9, 417.1]
17220    [355.8, 413.8]

events: Events
- frame: DataFrame (4 columns, 0 rows)
  shape: (0, 4)
  name    onset    offset    duration
  str     i64      i64       i64
- trial_columns: None

metadata: dict (0 items)
messages: None
trial_columns: None

experiment: Experiment
- eyetracker: EyeTracker
  - left: None
  - model: None
  - mount: None
  - right: None
  - sampling_rate: 250.0
  - vendor: None
  - version: None
- screen: Screen
  - distance_cm: 68
  - height_cm: 30.2
  - height_px: 1024
  - origin: 'upper left'
  - resolution: (1280, 1024)
  - size: (38, 30.2)
  - width_cm: 38
  - width_px: 1280
  - x_max_dva: 15.599386487782953
  - x_min_dva: -15.599386487782953
  - y_max_dva: 12.508044410882546
  - y_min_dva: -12.508044410882546

Samples: The Core Time Series

The most important part of the Gaze object is the samples table. Each row corresponds to one recorded time point, and each column represents a signal channel, such as gaze position, pupil size, velocity, or other measurements. Internally, gaze signals can be stored in nested component columns. For example, the pixel column holds the x and y pixel coordinates of each sample as a two-element list. Read more in Defining Gaze Components.
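
The idea of a nested component column can be illustrated without any library. The sketch below represents a few samples as a plain dictionary of columns and splits the nested `pixel` column into flat x and y columns. The helper `unnest_component` and the resulting column names `pixel_x` and `pixel_y` are hypothetical and chosen for illustration; they are not pymovements API.

```python
# A tiny samples table as a dictionary of columns, mirroring the layout above.
samples = {
    "time": [0, 4, 8],
    "pixel": [[206.8, 152.4], [207.0, 151.5], [207.6, 151.9]],
}


def unnest_component(table, column, names=("x", "y")):
    """Return a copy of `table` with one nested column split into flat columns."""
    # Copy every column except the nested one.
    flat = {key: list(values) for key, values in table.items() if key != column}
    # Create one flat column per component, e.g. pixel_x and pixel_y.
    for index, name in enumerate(names):
        flat[f"{column}_{name}"] = [row[index] for row in table[column]]
    return flat


flat_samples = unnest_component(samples, "pixel")
print(flat_samples["pixel_x"])
print(flat_samples["pixel_y"])
```

Keeping both components in one column makes it harder to accidentally process x without y, while the flat form is often more convenient for plotting or export.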

Events

If available, detected or imported eye-movement events are stored separately in gaze.events. These are not raw samples but fixations, saccades, or blinks that were precalculated by the eye tracker or added later through processing. Read more about events in Detecting Oculomotor Events.
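
As a rough illustration of how an event table differs from the samples table, the sketch below models events with the same fields as the events frame shown earlier (name, onset, offset, duration). The `Event` class and all values are hypothetical; in particular, deriving duration as offset minus onset is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One eye-movement event spanning a range of sample timestamps."""

    name: str
    onset: int
    offset: int

    @property
    def duration(self) -> int:
        # Assumed definition for this sketch: offset minus onset.
        return self.offset - self.onset


# Hypothetical events; timestamps use the same unit as the samples.
events = [
    Event("fixation", 0, 196),
    Event("saccade", 200, 232),
    Event("fixation", 236, 480),
]

fixation_durations = [e.duration for e in events if e.name == "fixation"]
mean_fixation_duration = sum(fixation_durations) / len(fixation_durations)
print(fixation_durations, mean_fixation_duration)
```

Each event summarizes many consecutive samples in a single row, which is why events are kept apart from the sample-level time series.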

Experiment

Each Gaze object can contain an associated Experiment, which defines screen geometry and sampling rate. This link is essential for interpreting the samples in physical or visual-angle units and for computing time-based measures like velocity. Read more about this in the next chapter Experiment Configuration.
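
To show why screen geometry matters, the sketch below converts a pixel coordinate to degrees of visual angle using the screen values from the example above (1280 × 1024 px, 38 × 30.2 cm, 68 cm viewing distance). The function `pixel_to_dva` is a hypothetical helper, and the formula is an assumption: it centers the coordinate on the screen midpoint, scales it to centimeters, and applies the arctangent. With these inputs it closely reproduces the `x_max_dva` and `y_max_dva` values listed in the Screen output, but it should not be read as the library's implementation.

```python
import math


def pixel_to_dva(pixel, n_pixels, size_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle from screen center.

    Assumes the pixel origin lies at the screen edge (e.g. upper left) and that
    the participant looks straight at the center of the screen.
    """
    # Shift the origin to the screen center (pixel indices run 0 .. n_pixels - 1).
    centered_px = pixel - (n_pixels - 1) / 2
    # Scale from pixels to centimeters on the display surface.
    centered_cm = centered_px * size_cm / n_pixels
    # Angle between the line of sight to the center and to the target point.
    return math.degrees(math.atan2(centered_cm, distance_cm))


x_max = pixel_to_dva(1279, n_pixels=1280, size_cm=38, distance_cm=68)
x_min = pixel_to_dva(0, n_pixels=1280, size_cm=38, distance_cm=68)
y_max = pixel_to_dva(1023, n_pixels=1024, size_cm=30.2, distance_cm=68)
print(round(x_max, 2), round(x_min, 2), round(y_max, 2))
```

Without the screen size and viewing distance, the same pixel offset could correspond to very different visual angles, which is why the samples alone are not enough for this conversion.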

Other

Additionally, the Gaze object can contain various metadata provided during import and optional time-stamped messages from the experiment software.

Good To Know

Coordinate systems

Depending on the experimental setup and research question, gaze data can be expressed in different coordinate systems. For instance, allocentric coordinates describe where gaze falls on the stimulus or display surface, typically in pixels or degrees of visual angle. Egocentric coordinates describe eye orientation relative to the head, often in degrees of rotation. These coordinates are more common in head-mounted or mobile eye tracking.

pymovements primarily works with stimulus-referenced coordinates but allows explicit transformations when the necessary experimental information is available.

The Optimal Pipeline

There is no single preprocessing pipeline or set of eye-tracking measures that is optimal for all research questions. Instead, appropriate choices depend on the experimental design, the properties of the recording device, and the quality of the data (see Inspecting Data Quality). Making these choices, and every transformation applied to the data, explicit and transparent is therefore essential for valid, interpretable, and reproducible analysis.

What Do Eye-Tracking Data Tell Us?

Crucially, eye-tracking data are signals rather than direct measurements of perception or cognition. Constructs such as attention, comprehension, or cognitive processes are inferred through preprocessing, event detection, and analysis choices.
