events2segmentation

pymovements.events.segmentation.events2segmentation(events: DataFrame, name: str, time_column: str = 'time', trial_columns: list[str] | None = None, onset_column: str = 'onset', offset_column: str = 'offset') → Expr

Convert a list of events to a binary segmentation expression.

This function creates a boolean expression that evaluates to True if a sample falls within any event interval and False otherwise. Events are defined with inclusive onset and inclusive offset, matching the convention used by the segmentation2events() function.

Parameters:
  • events (pl.DataFrame) – Event data. Must contain the columns named by onset_column and offset_column.

  • name (str) – The name of the event type to use for segmentation (e.g. ‘blink’). The returned expression is aliased to this name.

  • time_column (str) – The name of the column containing the timestamps. Default is ‘time’.

  • trial_columns (list[str] | None) – The names of the columns containing trial identifiers. If provided, events will only be mapped to samples with matching trial identifiers. Default is None.

  • onset_column (str) – The name of the column containing the onset of the event (inclusive). The values must correspond to the values in time_column. Default is ‘onset’.

  • offset_column (str) – The name of the column containing the offset of the event (inclusive). The values must correspond to the values in time_column. Default is ‘offset’.

Returns:

A boolean expression aliased to name.

Return type:

pl.Expr

Raises:

ValueError – If onset_column or offset_column is missing from events, or if any onset is greater than its offset.
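The two error conditions can be illustrated with a minimal plain-Python sketch. This is not the pymovements implementation: the helper names check_events and raises_value_error, the dictionary-based event layout, and the error messages are all assumptions made for illustration.

```python
def check_events(events, onset_column='onset', offset_column='offset'):
    """Raise ValueError for the two documented error conditions.

    `events` is a plain dict mapping column names to lists of values
    (a hypothetical stand-in for the events DataFrame).
    """
    # Condition 1: a required column is missing.
    for column in (onset_column, offset_column):
        if column not in events:
            raise ValueError(f"events is missing the column '{column}'")
    # Condition 2: an onset lies after its offset.
    for onset, offset in zip(events[onset_column], events[offset_column]):
        if onset > offset:
            raise ValueError(f'onset {onset} is greater than offset {offset}')


def raises_value_error(events):
    """Return True if check_events rejects the given events."""
    try:
        check_events(events)
    except ValueError:
        return True
    return False


valid = raises_value_error({'onset': [2, 7], 'offset': [5, 9]})
missing_column = raises_value_error({'onset': [2, 7]})
reversed_interval = raises_value_error({'onset': [5], 'offset': [2]})
print(valid, missing_column, reversed_interval)
```

An event whose onset equals its offset passes the check, since both bounds are inclusive and such an event covers exactly one sample.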

Notes

Events are defined with inclusive onset and inclusive offset. For example, an event with onset 2 and offset 4 includes samples where the time_column has values 2, 3, and 4.

The onset and offset values in the events DataFrame are compared directly against the values in the time_column of the samples DataFrame. If the time_column contains indices, then onsets and offsets are indices. If the time_column contains timestamps, then onsets and offsets are timestamps.
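The inclusive comparison described in these notes can be sketched in plain Python. The helper in_any_event is hypothetical and only mirrors the documented semantics; it is not how the function builds its Polars expression. The timestamp values are made up for illustration.

```python
def in_any_event(sample_time, events):
    """Return True if sample_time lies within any (onset, offset) pair.

    Both bounds are inclusive, as documented above.
    """
    return any(onset <= sample_time <= offset for onset, offset in events)


# Index-based samples: the event (2, 4) covers the samples 2, 3 and 4.
index_mask = [in_any_event(t, [(2, 4)]) for t in range(6)]
print(index_mask)

# Timestamp-based samples (hypothetical values): onsets and offsets are
# then timestamps too, compared in the same unit as the sample times.
timestamps = [0.0, 0.5, 1.0, 1.5, 2.0]
timestamp_mask = [in_any_event(t, [(0.5, 1.5)]) for t in timestamps]
print(timestamp_mask)
```

In both cases the comparison is the same; only the meaning of the values in the time column differs.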

Warning

The offset is considered inclusive. This means that the sample with the offset value in the time_column is part of the event.

Examples

>>> import polars as pl
>>> from pymovements.events import events2segmentation
>>>
>>> events_df = pl.DataFrame(
...     {'name': ['blink', 'blink', 'not_blink'], 'onset': [2, 7, 3], 'offset': [5, 9, 6]}
... )
>>> gaze_df = pl.DataFrame({'time': range(10)})
>>>
>>> # Create a boolean indicator column for blinks using a Polars expression
>>> gaze_df.with_columns(
...     events2segmentation(events_df, name='blink')
... )
shape: (10, 2)
┌──────┬───────┐
│ time ┆ blink │
│ ---  ┆ ---   │
│ i64  ┆ bool  │
╞══════╪═══════╡
│ 0    ┆ false │
│ 1    ┆ false │
│ 2    ┆ true  │
│ 3    ┆ true  │
│ 4    ┆ true  │
│ 5    ┆ true  │
│ 6    ┆ false │
│ 7    ┆ true  │
│ 8    ┆ true  │
│ 9    ┆ true  │
└──────┴───────┘
>>> # With trial columns
>>> events_df = pl.DataFrame({
...     'name': ['blink', 'blink'],
...     'onset': [2, 1],
...     'offset': [3, 3],
...     'trial': [1, 2],
... })
>>> gaze_df = pl.DataFrame({
...     'time': pl.Series([0, 1, 2, 0, 1, 2, 3], dtype=pl.Int64),
...     'trial': [1, 1, 1, 2, 2, 2, 2],
... })
>>> gaze_df.with_columns(
...     events2segmentation(events_df, name='blink', trial_columns=['trial'])
... )
shape: (7, 3)
┌──────┬───────┬───────┐
│ time ┆ trial ┆ blink │
│ ---  ┆ ---   ┆ ---   │
│ i64  ┆ i64   ┆ bool  │
╞══════╪═══════╪═══════╡
│ 0    ┆ 1     ┆ false │
│ 1    ┆ 1     ┆ false │
│ 2    ┆ 1     ┆ true  │
│ 0    ┆ 2     ┆ false │
│ 1    ┆ 2     ┆ true  │
│ 2    ┆ 2     ┆ true  │
│ 3    ┆ 2     ┆ true  │
└──────┴───────┴───────┘
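The effect of trial_columns in the example above can be reproduced with a small plain-Python sketch. The helper segment and its tuple-based data layout are assumptions for illustration only, not part of pymovements.

```python
def segment(samples, events, match_trials=True):
    """Return one boolean per sample.

    samples: list of (time, trial) tuples.
    events: list of (onset, offset, trial) tuples with inclusive bounds.
    If match_trials is True, an event only applies to samples whose
    trial identifier equals the event's trial identifier.
    """
    return [
        any(
            onset <= time <= offset and (not match_trials or trial == event_trial)
            for onset, offset, event_trial in events
        )
        for time, trial in samples
    ]


# Same data as in the doctest above.
events = [(2, 3, 1), (1, 3, 2)]
samples = [(0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2), (3, 2)]

with_trials = segment(samples, events)
without_trials = segment(samples, events, match_trials=False)
print(with_trials)
print(without_trials)
```

Without trial matching, the sample at time 1 in trial 1 would be marked as part of the trial 2 event, which is why trial identifiers matter when timestamps restart in each trial.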