Cleaning Gaze Data During Blinks

During blinks, the eyelid partially or fully covers the pupil, producing gaze samples that do not reflect actual eye position. These blink artifacts corrupt downstream analyses such as fixation detection, velocity computation, and saccade classification.

This notebook demonstrates how to:

  1. Load a real EyeLink dataset with blink events

  2. Visualize the raw gaze signal with blink regions highlighted

  3. Use nullify_event_samples() to remove blink artifacts (with optional padding)

  4. Visualize the cleaned result, showing which samples were nullified

import matplotlib.pyplot as plt
import numpy as np
import polars as pl

import pymovements as pm
from pymovements.gaze.io import from_asc

1. Load Real EyeLink Data

We use the ToyDatasetEyeLink dataset, which contains monocular eye tracking data recorded at 1000 Hz using an EyeLink Portable Duo.

We first use Dataset.download() to fetch the data, then load the .asc file directly with events=True so that blink events from SBLINK/EBLINK markers are parsed.

# Download the dataset
dataset = pm.Dataset('ToyDatasetEyeLink', path='data/ToyDataset')
dataset.download()

# Load the first ASC file with events=True to parse blink events
raw_dir = dataset.paths.raw / 'pymovements-toy-dataset-eyelink-main'
asc_file = raw_dir / 'raw' / 'subject_1_session_1.asc'

gaze = from_asc(
    asc_file,
    patterns='eyelink',
    encoding='ascii',
    events=True,
)

print('Samples shape:', gaze.samples.shape)
print('Columns:', gaze.samples.columns)
gaze.samples.head()
INFO:pymovements.dataset.dataset:
        You are downloading the pymovements Toy Dataset EyeLink. Please be aware that pymovements does not
        host or distribute any dataset resources and only provides a convenient interface to
        download the public dataset resources that were published by their respective authors.

        Please cite the referenced publication if you intend to use the dataset in your research.
        
Downloading https://github.com/pymovements/pymovements-toy-dataset-eyelink/archive/refs/heads/main.zip to data/ToyDataset/downloads/pymovements-toy-dataset-eyelink.zip
Checking integrity of pymovements-toy-dataset-eyelink.zip
Extracting pymovements-toy-dataset-eyelink.zip to data/ToyDataset/raw
Extracting archive: 100%|██████████| 4/4 [00:00<00:00, 88.29file/s]

Samples shape: (128342, 3)
Columns: ['time', 'pupil', 'pixel']
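shape: (5, 3)
┌─────────┬───────┬────────────────┐
│ time    ┆ pupil ┆ pixel          │
│ ---     ┆ ---   ┆ ---            │
│ i64     ┆ f64   ┆ list[f64]      │
╞═════════╪═══════╪════════════════╡
│ 2154556 ┆ 778.0 ┆ [138.1, 132.8] │
│ 2154557 ┆ 778.0 ┆ [138.2, 132.7] │
│ 2154558 ┆ 778.0 ┆ [138.2, 132.3] │
│ 2154559 ┆ 778.0 ┆ [138.1, 131.9] │
│ 2154560 ┆ 777.0 ┆ [137.9, 131.6] │
└─────────┴───────┴────────────────┘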

2. Inspect Blink Events

EyeLink blink events are stored with the name blink_eyelink. Let’s look at the detected blinks and their durations.

# Show all event types in the data
print('Event types:', gaze.events.frame['name'].unique().to_list())

# Filter to blink events only
blink_events = gaze.events.frame.filter(pl.col('name') == 'blink_eyelink')
print(f'\nFound {len(blink_events)} blink events:')
blink_events
Event types: ['fixation_eyelink', 'blink_eyelink', 'saccade_eyelink']

Found 19 blink events:
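shape: (19, 5)
┌───────────────┬──────┬─────────┬─────────┬──────────┐
│ name          ┆ eye  ┆ onset   ┆ offset  ┆ duration │
│ ---           ┆ ---  ┆ ---     ┆ ---     ┆ ---      │
│ str           ┆ str  ┆ i64     ┆ i64     ┆ i64      │
╞═══════════════╪══════╪═════════╪═════════╪══════════╡
│ blink_eyelink ┆ left ┆ 2157547 ┆ 2157575 ┆ 28       │
│ blink_eyelink ┆ left ┆ 2159353 ┆ 2159404 ┆ 51       │
│ blink_eyelink ┆ left ┆ 2159486 ┆ 2159575 ┆ 89       │
│ blink_eyelink ┆ left ┆ 2165704 ┆ 2165759 ┆ 55       │
│ blink_eyelink ┆ left ┆ 2170335 ┆ 2170391 ┆ 56       │
│ …             ┆ …    ┆ …       ┆ …       ┆ …        │
│ blink_eyelink ┆ left ┆ 2205859 ┆ 2205924 ┆ 65       │
│ blink_eyelink ┆ left ┆ 2211374 ┆ 2211443 ┆ 69       │
│ blink_eyelink ┆ left ┆ 2214206 ┆ 2214261 ┆ 55       │
│ blink_eyelink ┆ left ┆ 2218186 ┆ 2218258 ┆ 72       │
│ blink_eyelink ┆ left ┆ 2220171 ┆ 2220481 ┆ 310      │
└───────────────┴──────┴─────────┴─────────┴──────────┘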

3. Visualize Raw Signal with Blink Regions

We pick a time window that contains a few blinks and plot the raw gaze signal with blink intervals shaded in gray.

# Extract time, pixel coordinates, and pupil as arrays (before cleaning)
time_arr = gaze.samples['time'].to_numpy()
pixel_data = gaze.samples['pixel'].to_list()
x_raw = np.array([p[0] if p is not None else np.nan for p in pixel_data])
y_raw = np.array([p[1] if p is not None else np.nan for p in pixel_data])
pupil_raw = gaze.samples['pupil'].to_numpy().copy()

# Get blink onset/offset pairs
blink_onsets = blink_events['onset'].to_list()
blink_offsets = blink_events['offset'].to_list()
blink_regions = list(zip(blink_onsets, blink_offsets))

# Focus on a window around the first few blinks
window_start = blink_onsets[0] - 500
window_end = blink_offsets[2] + 500 if len(blink_onsets) > 2 else blink_offsets[-1] + 500
mask = (time_arr >= window_start) & (time_arr <= window_end)

fig, axes = plt.subplots(2, 1, figsize=(14, 6), sharex=True)

for ax, data, label, color in [
    (axes[0], x_raw, 'Gaze X (px)', 'steelblue'),
    (axes[1], y_raw, 'Gaze Y (px)', 'darkorange'),
]:
    ax.plot(time_arr[mask], data[mask], color=color, linewidth=0.8)

    for onset, offset in blink_regions:
        if onset >= window_start and onset <= window_end:
            ax.axvspan(onset, offset, alpha=0.2, color='gray')

    ax.set_ylabel(label)
    ax.grid(True, alpha=0.3)

axes[1].set_xlabel('Time (ms)')
fig.suptitle('Raw Gaze Signal with Blink Regions (gray)', fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()
[Figure: Raw Gaze Signal with Blink Regions (gray). Two stacked panels show gaze X and gaze Y in pixels against time in ms.]

4. Apply nullify_event_samples()

We nullify gaze samples during blink events. The padding parameter extends the cleaning window to also remove the unreliable samples immediately before and after each blink:

  • padding=10 means 10 ms of symmetric padding (same before and after)

  • padding=(20, 10) means 20 ms before and 10 ms after (asymmetric)

Asymmetric padding is useful because the onset of a blink (eyelid closing) often produces artifacts slightly before the detected blink start, while the offset (eyelid opening) artifacts resolve more quickly.
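
To make the two padding forms concrete, the sketch below builds a boolean mask of the samples that fall inside padded event windows. It is an illustration only and not the library's implementation: the helper padded_event_mask is hypothetical, and treating both window boundaries as inclusive is an assumption.

```python
import numpy as np


def padded_event_mask(time, onsets, offsets, padding):
    """Return a boolean mask of samples inside any padded event window.

    Hypothetical helper for illustration; not part of pymovements.
    An int means symmetric padding, a (before, after) tuple asymmetric.
    Window boundaries are treated as inclusive (an assumption).
    """
    before, after = (padding, padding) if isinstance(padding, int) else padding
    time = np.asarray(time)
    mask = np.zeros(time.shape, dtype=bool)
    for onset, offset in zip(onsets, offsets):
        mask |= (time >= onset - before) & (time <= offset + after)
    return mask


# Toy recording sampled at 1000 Hz: one timestamp per millisecond.
toy_time = np.arange(0, 200)
toy_onsets, toy_offsets = [100], [120]  # a single 20 ms event

symmetric = padded_event_mask(toy_time, toy_onsets, toy_offsets, padding=10)
asymmetric = padded_event_mask(toy_time, toy_onsets, toy_offsets, padding=(20, 10))

# First and last masked timestamp for each padding form.
print(toy_time[symmetric][[0, -1]])
print(toy_time[asymmetric][[0, -1]])
```

With asymmetric padding, only the start of the window moves earlier; the end stays where the symmetric 10 ms padding put it.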

# Apply blink cleaning with the default symmetric padding of 25 ms
gaze.nullify_event_samples('blink_eyelink')

# Count how many samples were nullified
null_count = gaze.samples['pixel'].null_count()
total = gaze.samples.height

print(f'Nullified {null_count} / {total} samples ({100 * null_count / total:.1f}%)')
print('Using default padding: (25, 25) ms')
Nullified 2364 / 128342 samples (1.8%)
Using default padding: (25, 25) ms

5. Visualize Before vs. After

We plot the same time window again, now showing which samples were nullified (red) and the cleaned signal with gaps where blink data was removed.

# Build null mask
null_mask = gaze.samples['pixel'].is_null().to_numpy()

# Extract cleaned coordinates
cleaned_pixels = gaze.samples['pixel'].to_list()
x_cleaned = np.array([p[0] if p is not None else np.nan for p in cleaned_pixels])
y_cleaned = np.array([p[1] if p is not None else np.nan for p in cleaned_pixels])

# Default padding used
padding = (25, 25)

# Compute padded blink regions for shading
padded_regions = [
    (onset - padding[0], offset + padding[1])
    for onset, offset in blink_regions
]

# Plot before vs. after in the same time window
fig, axes = plt.subplots(2, 2, figsize=(16, 7), sharex=True)

for col, (x_data, y_data, label) in enumerate([
    (x_raw, y_raw, 'Before Cleaning'),
    (x_cleaned, y_cleaned, 'After Cleaning'),
]):
    for row, (data, ylabel, color) in enumerate([
        (x_data, 'Gaze X (px)', 'steelblue'),
        (y_data, 'Gaze Y (px)', 'darkorange'),
    ]):
        ax = axes[row, col]
        ax.plot(time_arr[mask], data[mask], color=color, linewidth=0.8)

        for onset, offset in padded_regions:
            if onset >= window_start and onset <= window_end:
                ax.axvspan(onset, offset, alpha=0.12, color='red')

        # On the 'before' panel, mark nullified samples in red
        if col == 0:
            null_in_window = mask & null_mask
            ax.scatter(
                time_arr[null_in_window], data[null_in_window],
                color='red', s=8, zorder=5, label='Nullified',
            )
            ax.legend(loc='upper right', fontsize=8)

        ax.set_ylabel(ylabel)
        ax.set_title(label if row == 0 else '', fontsize=11)
        ax.grid(True, alpha=0.3)

axes[1, 0].set_xlabel('Time (ms)')
axes[1, 1].set_xlabel('Time (ms)')
fig.suptitle('Before vs. After Blink Cleaning', fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()
[Figure: Before vs. After Blink Cleaning. A 2×2 grid shows gaze X and gaze Y in pixels against time in ms, before cleaning (left column) and after cleaning (right column).]
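
Beyond visual inspection, the gaps can also be recovered from a null mask and compared against the padded blink regions. The following is a minimal sketch on toy data; null_runs is a hypothetical helper, not a pymovements function.

```python
import numpy as np


def null_runs(time, mask):
    """Return (start_time, end_time) for each contiguous run of True values.

    Hypothetical helper for illustration; not part of pymovements.
    """
    time = np.asarray(time)
    # Pad with False so that runs touching either end are detected.
    padded = np.concatenate(([False], np.asarray(mask, dtype=bool), [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)      # index of the first True in each run
    ends = np.flatnonzero(edges == -1) - 1   # index of the last True in each run
    return [(int(time[s]), int(time[e])) for s, e in zip(starts, ends)]


# Toy example: 1000 Hz timestamps with two nullified stretches.
toy_time = np.arange(1000, 1020)
toy_mask = np.zeros(toy_time.size, dtype=bool)
toy_mask[3:7] = True
toy_mask[15:20] = True

toy_gaps = null_runs(toy_time, toy_mask)
print(toy_gaps)
```

Applied to the recording's own time array and null mask, each recovered run should line up with one padded blink region, or with several regions if their padded windows overlap.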

6. Pupil Signal During Blinks

The pupil size signal also shows characteristic artifacts during blinks. Let’s visualize the pupil trace alongside the blink regions.

pupil_cleaned = gaze.samples['pupil'].to_numpy()
null_in_window = mask & null_mask

fig, axes = plt.subplots(1, 2, figsize=(16, 3.5), sharex=True, sharey=True)

# Before: original pupil signal with nullified samples marked in red
axes[0].plot(time_arr[mask], pupil_raw[mask], color='mediumpurple', linewidth=0.8)
axes[0].scatter(
    time_arr[null_in_window], pupil_raw[null_in_window],
    color='red', s=8, zorder=5, label='Nullified',
)
for onset, offset in padded_regions:
    if onset >= window_start and onset <= window_end:
        axes[0].axvspan(onset, offset, alpha=0.12, color='red')
axes[0].set_title('Before Cleaning', fontsize=11)
axes[0].set_ylabel('Pupil Size')
axes[0].set_xlabel('Time (ms)')
axes[0].legend(loc='upper right', fontsize=8)
axes[0].grid(True, alpha=0.3)

# After: cleaned pupil signal with gaps
axes[1].plot(time_arr[mask], pupil_cleaned[mask], color='mediumpurple', linewidth=0.8)
for onset, offset in padded_regions:
    if onset >= window_start and onset <= window_end:
        axes[1].axvspan(onset, offset, alpha=0.12, color='red')
axes[1].set_title('After Cleaning', fontsize=11)
axes[1].set_xlabel('Time (ms)')
axes[1].grid(True, alpha=0.3)

fig.suptitle('Pupil Signal: Before vs. After Blink Cleaning', fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()
[Figure: Pupil Signal: Before vs. After Blink Cleaning. Two side-by-side panels show pupil size against time in ms, before cleaning (left) and after cleaning (right).]

7. Summary Statistics

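The following summary lists, for each blink, the detected interval and the padded interval targeted for nullification, followed by the overall share of nullified samples.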

summary_rows = []
for row in blink_events.to_dicts():
    onset = row['onset']
    offset = row['offset']
    summary_rows.append({
        'onset': onset,
        'offset': offset,
        'blink_ms': offset - onset,
        'padded_onset': onset - padding[0],
        'padded_offset': offset + padding[1],
        'padded_ms': (offset + padding[1]) - (onset - padding[0]),
    })

summary_df = pl.DataFrame(summary_rows)
print('Blink Cleaning Summary')
print('=' * 60)
print(summary_df)
print(f'\nTotal samples: {total}')
print(f'Total nullified: {null_count} ({100 * null_count / total:.1f}%)')
print(f'Remaining usable: {total - null_count} ({100 * (total - null_count) / total:.1f}%)')
Blink Cleaning Summary
============================================================
shape: (19, 6)
┌─────────┬─────────┬──────────┬──────────────┬───────────────┬───────────┐
│ onset   ┆ offset  ┆ blink_ms ┆ padded_onset ┆ padded_offset ┆ padded_ms │
│ ---     ┆ ---     ┆ ---      ┆ ---          ┆ ---           ┆ ---       │
│ i64     ┆ i64     ┆ i64      ┆ i64          ┆ i64           ┆ i64       │
╞═════════╪═════════╪══════════╪══════════════╪═══════════════╪═══════════╡
│ 2157547 ┆ 2157575 ┆ 28       ┆ 2157522      ┆ 2157600       ┆ 78        │
│ 2159353 ┆ 2159404 ┆ 51       ┆ 2159328      ┆ 2159429       ┆ 101       │
│ 2159486 ┆ 2159575 ┆ 89       ┆ 2159461      ┆ 2159600       ┆ 139       │
│ 2165704 ┆ 2165759 ┆ 55       ┆ 2165679      ┆ 2165784       ┆ 105       │
│ 2170335 ┆ 2170391 ┆ 56       ┆ 2170310      ┆ 2170416       ┆ 106       │
│ …       ┆ …       ┆ …        ┆ …            ┆ …             ┆ …         │
│ 2205859 ┆ 2205924 ┆ 65       ┆ 2205834      ┆ 2205949       ┆ 115       │
│ 2211374 ┆ 2211443 ┆ 69       ┆ 2211349      ┆ 2211468       ┆ 119       │
│ 2214206 ┆ 2214261 ┆ 55       ┆ 2214181      ┆ 2214286       ┆ 105       │
│ 2218186 ┆ 2218258 ┆ 72       ┆ 2218161      ┆ 2218283       ┆ 122       │
│ 2220171 ┆ 2220481 ┆ 310      ┆ 2220146      ┆ 2220506       ┆ 360       │
└─────────┴─────────┴──────────┴──────────────┴───────────────┴───────────┘

Total samples: 128342
Total nullified: 2364 (1.8%)
Remaining usable: 125978 (98.2%)
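
The padded_ms column treats each blink independently. When blinks occur close together, their padded windows can overlap, and summing padded_ms would then count the shared time twice. The sketch below merges overlapping windows before measuring them. The helper merge_intervals is hypothetical, and the 50 ms padding is chosen only to force an overlap between the second and third blinks listed above; with the default 25 ms, those two windows do not overlap.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals.

    Hypothetical helper for illustration; not part of pymovements.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Onset/offset of the first three blinks from the summary table.
first_blinks = [(2157547, 2157575), (2159353, 2159404), (2159486, 2159575)]

pad = 50  # hypothetical padding in ms, larger than the 25 ms default
padded_windows = [(onset - pad, offset + pad) for onset, offset in first_blinks]
merged_windows = merge_intervals(padded_windows)

naive_ms = sum(end - start for start, end in padded_windows)
merged_ms = sum(end - start for start, end in merged_windows)
print(merged_windows)
print(naive_ms, merged_ms)
```

The difference between the two totals is the time shared by the overlapping windows, which a per-blink sum would count twice.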

8. Apply to All Recordings and Inspect Blink Instances

We clean all recordings, then plot every blink instance (with a window of context around each) so you can visually verify the cleaning.

padding_all = (25, 25)  # default
context_ms = 100  # extra ms of context before/after the padded region

# Collect all blink instances across recordings
all_blinks = []

asc_dir = dataset.paths.raw / 'pymovements-toy-dataset-eyelink-main' / 'raw'

for asc_path in sorted(asc_dir.glob('*.asc')):
    gaze_obj = from_asc(
        asc_path,
        patterns='eyelink',
        encoding='ascii',
        events=True,
    )

    blinks = gaze_obj.events.frame.filter(pl.col('name') == 'blink_eyelink')
    n_blinks = len(blinks)

    # Save raw data before cleaning
    t = gaze_obj.samples['time'].to_numpy()
    px = gaze_obj.samples['pixel'].to_list()
    x_before = np.array([p[0] if p is not None else np.nan for p in px])
    y_before = np.array([p[1] if p is not None else np.nan for p in px])
    pupil_before = gaze_obj.samples['pupil'].to_numpy().copy()

    # Apply cleaning with default padding
    gaze_obj.nullify_event_samples('blink_eyelink')
    null_mask_all = gaze_obj.samples['pixel'].is_null().to_numpy()

    null_count = null_mask_all.sum()
    total = gaze_obj.samples.height
    print(
        f'{asc_path.name}: {n_blinks} blinks, '
        f'{null_count}/{total} samples nullified ({100 * null_count / total:.1f}%)'
    )

    # Store each blink instance
    for row in blinks.to_dicts():
        onset, offset = row['onset'], row['offset']
        win_start = onset - padding_all[0] - context_ms
        win_end = offset + padding_all[1] + context_ms
        win = (t >= win_start) & (t <= win_end)

        all_blinks.append({
            'file': asc_path.stem,
            'onset': onset,
            'offset': offset,
            'duration': offset - onset,
            'time': t[win],
            'x_raw': x_before[win],
            'y_raw': y_before[win],
            'pupil_raw': pupil_before[win],
            'null_mask': null_mask_all[win],
        })

print(f'\nTotal blink instances collected: {len(all_blinks)}')
subject_1_session_1.asc: 19 blinks, 2364/128342 samples nullified (1.8%)
subject_2_session_1.asc: 8 blinks, 989/109216 samples nullified (0.9%)

Total blink instances collected: 27
# Plot all blink instances in a grid: pupil signal with nullified samples in red
n = len(all_blinks)
ncols = 5
nrows = int(np.ceil(n / ncols))

fig, axes = plt.subplots(nrows, ncols, figsize=(ncols * 3, nrows * 2.2), squeeze=False)

for idx, blink in enumerate(all_blinks):
    row, col = divmod(idx, ncols)
    ax = axes[row, col]

    t_blink = blink['time']
    pupil = blink['pupil_raw']
    nmask = blink['null_mask']

    # Plot full raw pupil trace
    ax.plot(t_blink, pupil, color='mediumpurple', linewidth=0.8)

    # Overlay nullified samples in red
    if nmask.any():
        ax.scatter(t_blink[nmask], pupil[nmask], color='red', s=6, zorder=5)

    # Shade the original blink interval in gray
    ax.axvspan(blink['onset'], blink['offset'], alpha=0.2, color='gray')

    # Shade the padded region in light red
    ax.axvspan(
        blink['onset'] - padding_all[0], blink['offset'] + padding_all[1],
        alpha=0.08, color='red',
    )

    ax.set_title(f"#{idx + 1} ({blink['duration']}ms)", fontsize=8)
    ax.tick_params(labelsize=6)
    ax.set_yticks([])

# Hide unused subplots
for idx in range(n, nrows * ncols):
    row, col = divmod(idx, ncols)
    axes[row, col].set_visible(False)

fig.suptitle(
    f'All {n} Blink Instances — Pupil Signal (gray=blink, red=nullified with padding)',
    fontsize=12, fontweight='bold',
)
plt.tight_layout()
plt.show()
[Figure: All 27 Blink Instances, Pupil Signal (gray=blink, red=nullified with padding). Grid of per-blink pupil traces against time, five panels per row.]

Key Considerations

  • Load with events=True: When using from_asc(), pass events=True to parse blink events from the EyeLink SBLINK/EBLINK markers. Without this flag, blink events are not loaded.

  • Choose padding values based on your sampling rate and on how your eye tracker reports blink boundaries. Padding is given in milliseconds, so at 1000 Hz, 20 ms of padding covers 20 samples.

  • Clean before computing derived signals (velocity, acceleration) to prevent blink artifacts from propagating.

  • Asymmetric padding (before, after) is often preferable to the symmetric default used in this tutorial, because blink onset artifacts typically extend further than offset artifacts.

  • The time and trial columns are never nullified, preserving temporal alignment.

  • EyeLink blink events are named blink_eyelink. Other eye trackers may use different naming conventions.
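
The point about derived signals can be illustrated with a toy one-dimensional trace. The sketch below uses np.gradient as a stand-in velocity estimate (an assumption made for brevity; it is not necessarily how pymovements computes velocity) and compares a trace containing a blink-like artifact with the same trace after the artifact samples are set to NaN.

```python
import numpy as np

sampling_rate = 1000  # Hz, matching the recordings in this tutorial
dt = 1 / sampling_rate  # seconds between samples

# Steady gaze at 100 px with a three-sample artifact dropping to 0 px.
x_artifact = np.array([100.0, 100, 100, 100, 0, 0, 0, 100, 100, 100])

# Cleaned copy: artifact samples replaced by NaN (stand-in for nullified samples).
x_nulled = x_artifact.copy()
x_nulled[4:7] = np.nan

# Finite-difference velocity in px/s.
v_artifact = np.gradient(x_artifact, dt)
v_nulled = np.gradient(x_nulled, dt)

print(np.max(np.abs(v_artifact)))   # spurious velocity at the artifact edges
print(np.nanmax(np.abs(v_nulled)))  # remaining valid samples show no motion
```

In the uncleaned trace, the artifact edges produce large velocities that a velocity-based event detector could mistake for eye movements. In the cleaned trace, the gap widens by one sample on each side because the central difference needs both neighbors, but every remaining valid sample is unaffected.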

© Copyright 2022-2025 The pymovements Project Authors.