events2segmentation

pymovements.transforms.events2segmentation(events: DataFrame, name: str, time_column: str = 'time', trial_columns: list[str] | None = None, onset_column: str = 'onset', offset_column: str = 'offset', padding: float | tuple[float, float] | None = None) → Expr

Convert a DataFrame of events to a binary segmentation expression.

This function creates a boolean expression that evaluates to True if a sample falls within any event interval and False otherwise. Events are defined with inclusive onset and inclusive offset, matching the convention used by the segmentation2events() function.

Parameters:

events (pl.DataFrame) – Event data. Must have onset and offset columns.
name (str) – The name of the event type to use for segmentation (e.g. 'blink'). The returned expression is aliased to this name.
time_column (str) – The name of the column containing the timestamps. Default is 'time'.
trial_columns (list[str] | None) – The names of the columns containing trial identifiers. If provided, events will only be mapped to samples with matching trial identifiers. Default is None.
onset_column (str) – The name of the column containing the onset of the event (inclusive). The values must correspond to the values in time_column. Default is 'onset'.
offset_column (str) – The name of the column containing the offset of the event (inclusive). The values must correspond to the values in time_column. Default is 'offset'.
padding (float | tuple[float, float] | None) – Padding to extend each event interval, in the same units as time_column. If a single float, the same padding is applied symmetrically before and after each event. If a tuple (before, after), before is subtracted from the onset and after is added to the offset. Both values must be non-negative. Default is None (no padding).

Returns:

A boolean expression aliased to name.

Return type:

pl.Expr

Raises:

TypeError – If padding is neither None, a number, nor a tuple.
ValueError – If onset_column or offset_column is missing from events, if any onset is greater than its offset, or if any padding value is negative.

Notes

Events are defined with inclusive onset and inclusive offset. For example, an event with onset 2 and offset 4 includes samples where the time_column has values 2, 3, and 4.
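The inclusive-interval rule above can be illustrated without any library. The following is a minimal sketch in plain Python; the helpers in_event and segment are hypothetical names used only for illustration and are not part of pymovements.

```python
def in_event(t, onset, offset):
    """Return True if t lies in the closed interval [onset, offset]."""
    return onset <= t <= offset


def segment(times, events):
    """Mark a sample True if it falls within any (onset, offset) pair."""
    return [any(in_event(t, on, off) for on, off in events) for t in times]


# One event with onset 2 and offset 4, evaluated on samples 0..5.
mask = segment(range(6), [(2, 4)])
# Samples 2, 3 and 4 are marked; both endpoints belong to the event.
```

Because both bounds are inclusive, an event whose onset equals its offset still covers exactly one sample.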

The onset and offset values in the events DataFrame are compared directly against the values in the time_column of the samples DataFrame. If the time_column contains indices, then onsets and offsets are indices. If the time_column contains timestamps, then onsets and offsets are timestamps.
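To make the point about units concrete, here is a small plain-Python sketch. The recording (five samples, 2 ms apart, starting at 1000 ms) and the helper segment are assumptions made up for this illustration; the comparison itself mirrors the direct onset/offset check described above.

```python
def segment(times, onset, offset):
    """Compare each value in times directly against an inclusive interval."""
    return [onset <= t <= offset for t in times]


# Hypothetical recording: timestamps in ms and the matching sample indices.
timestamps = [1000, 1002, 1004, 1006, 1008]
indices = list(range(len(timestamps)))

# Same event, expressed once in timestamps and once in indices.
by_timestamp = segment(timestamps, onset=1002, offset=1006)
by_index = segment(indices, onset=1, offset=3)

# Timestamp-based onsets compared against an index column match nothing.
mismatched = segment(indices, onset=1002, offset=1006)
```

Both consistent variants mark the same three samples, while the mismatched call marks none, so the onset and offset columns should always use the same units as time_column.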

When padding is specified, each event interval is extended by subtracting the before padding from the onset and adding the after padding to the offset. A single number is used as both the before and the after padding. The padding values are in the same units as the time_column.
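The padding arithmetic can be sketched as follows. This is an illustrative plain-Python reading of the documented behaviour, not the library's implementation; normalize_padding and pad_interval are hypothetical helper names, and the error messages are made up.

```python
def normalize_padding(padding):
    """Return a (before, after) pair for a padding argument (hypothetical helper)."""
    if padding is None:
        return 0, 0
    if isinstance(padding, tuple):
        before, after = padding
    elif isinstance(padding, (int, float)):
        # A single number pads symmetrically.
        before = after = padding
    else:
        raise TypeError('padding must be None, a number, or a tuple')
    if before < 0 or after < 0:
        raise ValueError('padding values must be non-negative')
    return before, after


def pad_interval(onset, offset, padding=None):
    """Extend an inclusive [onset, offset] interval by the given padding."""
    before, after = normalize_padding(padding)
    return onset - before, offset + after


symmetric = pad_interval(3, 5, padding=1)        # interval becomes [2, 6]
asymmetric = pad_interval(3, 5, padding=(1, 2))  # interval becomes [2, 7]
unpadded = pad_interval(3, 5)                    # interval stays [3, 5]
```

The symmetric case corresponds to the padding=1 example in the Examples section, where an event from 3 to 5 marks samples 2 through 6.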

Warning

The offset is considered inclusive. This means that the sample with the offset value in the time_column is part of the event.

Examples

>>> import polars as pl
>>> from pymovements.transforms import events2segmentation
>>>
>>> events_df = pl.DataFrame(
...     {'name': ['blink', 'blink', 'not_blink'], 'onset': [2, 7, 3], 'offset': [5, 9, 6]}
... )
>>> gaze_df = pl.DataFrame({'time': range(10)})
>>>
>>> # Create a boolean indicator column for blinks using a Polars expression
>>> gaze_df.with_columns(
...     events2segmentation(events_df, name='blink')
... )
shape: (10, 2)
┌──────┬───────┐
│ time ┆ blink │
│ ---  ┆ ---   │
│ i64  ┆ bool  │
╞══════╪═══════╡
│ 0    ┆ false │
│ 1    ┆ false │
│ 2    ┆ true  │
│ 3    ┆ true  │
│ 4    ┆ true  │
│ 5    ┆ true  │
│ 6    ┆ false │
│ 7    ┆ true  │
│ 8    ┆ true  │
│ 9    ┆ true  │
└──────┴───────┘
>>> # With padding to extend event intervals
>>> single_event = pl.DataFrame(
...     {'name': ['blink'], 'onset': [3], 'offset': [5]}
... )
>>> gaze_df.with_columns(
...     events2segmentation(single_event, name='blink', padding=1)
... )
shape: (10, 2)
┌──────┬───────┐
│ time ┆ blink │
│ ---  ┆ ---   │
│ i64  ┆ bool  │
╞══════╪═══════╡
│ 0    ┆ false │
│ 1    ┆ false │
│ 2    ┆ true  │
│ 3    ┆ true  │
│ 4    ┆ true  │
│ 5    ┆ true  │
│ 6    ┆ true  │
│ 7    ┆ false │
│ 8    ┆ false │
│ 9    ┆ false │
└──────┴───────┘
>>> # With trial columns
>>> events_df = pl.DataFrame({
...     'name': ['blink', 'blink'],
...     'onset': [2, 1],
...     'offset': [3, 3],
...     'trial': [1, 2],
... })
>>> gaze_df = pl.DataFrame({
...     'time': pl.Series([0, 1, 2, 0, 1, 2, 3], dtype=pl.Int64),
...     'trial': [1, 1, 1, 2, 2, 2, 2],
... })
>>> gaze_df.with_columns(
...     events2segmentation(events_df, name='blink', trial_columns=['trial'])
... )
shape: (7, 3)
┌──────┬───────┬───────┐
│ time ┆ trial ┆ blink │
│ ---  ┆ ---   ┆ ---   │
│ i64  ┆ i64   ┆ bool  │
╞══════╪═══════╪═══════╡
│ 0    ┆ 1     ┆ false │
│ 1    ┆ 1     ┆ false │
│ 2    ┆ 1     ┆ true  │
│ 0    ┆ 2     ┆ false │
│ 1    ┆ 2     ┆ true  │
│ 2    ┆ 2     ┆ true  │
│ 3    ┆ 2     ┆ true  │
└──────┴───────┴───────┘
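The trial-column example above can be mirrored in plain Python to show what matching on trial identifiers means. This is only a sketch of the documented semantics using lists of dicts; segment_by_trial is a hypothetical helper and says nothing about how pymovements builds the Polars expression.

```python
def segment_by_trial(samples, events, trial_columns):
    """Mark a sample True if an event with matching trial values covers its time."""
    mask = []
    for sample in samples:
        hit = any(
            all(sample[col] == event[col] for col in trial_columns)
            and event['onset'] <= sample['time'] <= event['offset']
            for event in events
        )
        mask.append(hit)
    return mask


# Same data as the trial-column example above.
events = [
    {'onset': 2, 'offset': 3, 'trial': 1},
    {'onset': 1, 'offset': 3, 'trial': 2},
]
times = [0, 1, 2, 0, 1, 2, 3]
trials = [1, 1, 1, 2, 2, 2, 2]
samples = [{'time': t, 'trial': tr} for t, tr in zip(times, trials)]

per_trial = segment_by_trial(samples, events, ['trial'])

# In this sketch, an empty trial_columns list compares on time alone,
# so the trial-2 event also marks the sample at time 1 in trial 1.
time_only = segment_by_trial(samples, events, [])
```

Here per_trial reproduces the blink column shown above, while time_only differs in the second row, which is why trial_columns matters when timestamps restart in each trial.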