events2segmentation
- pymovements.transforms.events2segmentation(events: DataFrame, name: str, time_column: str = 'time', trial_columns: list[str] | None = None, onset_column: str = 'onset', offset_column: str = 'offset', padding: float | tuple[float, float] | None = None) → Expr
Convert a list of events to a binary segmentation expression.
This function creates a boolean expression that evaluates to True if a sample falls within any event interval and False otherwise. Events are defined with inclusive onset and inclusive offset, matching the convention used by the segmentation2events() function.
- Parameters:
events (pl.DataFrame) – Event data. Must have onset and offset columns.
name (str) – The name of the event type to use for segmentation (e.g. ‘blink’).
time_column (str) – The name of the column containing the timestamps. Default is ‘time’.
trial_columns (list[str] | None) – The names of the columns containing trial identifiers. If provided, events will only be mapped to samples with matching trial identifiers. Default is None.
onset_column (str) – The name of the column containing the onset of the event (inclusive). The values must correspond to the values in time_column. Default is ‘onset’.
offset_column (str) – The name of the column containing the offset of the event (inclusive). The values must correspond to the values in time_column. Default is ‘offset’.
padding (float | tuple[float, float] | None) – Padding to extend each event interval, in the same units as time_column. If a single float, the same padding is applied symmetrically before and after each event. If a tuple (before, after), before is subtracted from the onset and after is added to the offset. Both values must be non-negative. Default is None (no padding).
- Returns:
A boolean expression aliased to name.
- Return type:
pl.Expr
- Raises:
TypeError – If padding is not None, a tuple, or a number.
ValueError – If onset_column or offset_column is missing from events, if any onset is greater than its offset, or if any padding value is negative.
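The padding rules listed under Parameters and Raises can be sketched in plain Python. This is only an illustration of the documented behaviour, not the pymovements implementation: normalize_padding is a hypothetical helper name, and the sketch assumes that a tuple always has exactly two elements.

```python
from __future__ import annotations


def normalize_padding(
    padding: float | tuple[float, float] | None,
) -> tuple[float, float]:
    """Return (before, after) padding following the documented rules.

    Hypothetical helper for illustration; not part of pymovements.
    """
    if padding is None:
        # No padding requested: intervals stay unchanged.
        return 0.0, 0.0
    if isinstance(padding, tuple):
        # Assumption: the tuple has exactly two elements (before, after).
        before, after = padding
    elif isinstance(padding, (int, float)):
        # A single number pads symmetrically on both sides.
        before = after = padding
    else:
        raise TypeError(
            f"padding must be None, a tuple, or a number, got {type(padding).__name__}"
        )
    if before < 0 or after < 0:
        raise ValueError("padding values must be non-negative")
    return float(before), float(after)


print(normalize_padding(1))       # (1.0, 1.0)
print(normalize_padding((1, 2)))  # (1.0, 2.0)
```

Resolving the padding to a (before, after) pair up front keeps the later interval arithmetic identical for the symmetric and asymmetric cases.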
Notes
Events are defined with inclusive onset and inclusive offset. For example, an event with onset 2 and offset 4 includes samples where the time_column has values 2, 3, and 4.
The onset and offset values in the events DataFrame are compared directly against the values in the time_column of the samples DataFrame. If the time_column contains indices, then onsets and offsets are indices. If the time_column contains timestamps, then onsets and offsets are timestamps.
When padding is specified, each event interval is extended by subtracting pad_before from the onset and adding pad_after to the offset. The padding values are in the same units as the time_column.
Warning
The offset is considered inclusive. This means that the sample with the offset value in the time_column is part of the event.
Examples
>>> import polars as pl
>>> from pymovements.transforms import events2segmentation
>>>
>>> events_df = pl.DataFrame(
...     {'name': ['blink', 'blink', 'not_blink'], 'onset': [2, 7, 3], 'offset': [5, 9, 6]}
... )
>>> gaze_df = pl.DataFrame({'time': range(10)})
>>>
>>> # Create a boolean indicator column for blinks using a Polars expression
>>> gaze_df.with_columns(
...     events2segmentation(events_df, name='blink')
... )
shape: (10, 2)
┌──────┬───────┐
│ time ┆ blink │
│ ---  ┆ ---   │
│ i64  ┆ bool  │
╞══════╪═══════╡
│ 0    ┆ false │
│ 1    ┆ false │
│ 2    ┆ true  │
│ 3    ┆ true  │
│ 4    ┆ true  │
│ 5    ┆ true  │
│ 6    ┆ false │
│ 7    ┆ true  │
│ 8    ┆ true  │
│ 9    ┆ true  │
└──────┴───────┘
>>> # With padding to extend event intervals
>>> single_event = pl.DataFrame(
...     {'name': ['blink'], 'onset': [3], 'offset': [5]}
... )
>>> gaze_df.with_columns(
...     events2segmentation(single_event, name='blink', padding=1)
... )
shape: (10, 2)
┌──────┬───────┐
│ time ┆ blink │
│ ---  ┆ ---   │
│ i64  ┆ bool  │
╞══════╪═══════╡
│ 0    ┆ false │
│ 1    ┆ false │
│ 2    ┆ true  │
│ 3    ┆ true  │
│ 4    ┆ true  │
│ 5    ┆ true  │
│ 6    ┆ true  │
│ 7    ┆ false │
│ 8    ┆ false │
│ 9    ┆ false │
└──────┴───────┘
>>> # With trial columns
>>> events_df = pl.DataFrame({
...     'name': ['blink', 'blink'],
...     'onset': [2, 1],
...     'offset': [3, 3],
...     'trial': [1, 2],
... })
>>> gaze_df = pl.DataFrame({
...     'time': pl.Series([0, 1, 2, 0, 1, 2, 3], dtype=pl.Int64),
...     'trial': [1, 1, 1, 2, 2, 2, 2],
... })
>>> gaze_df.with_columns(
...     events2segmentation(events_df, name='blink', trial_columns=['trial'])
... )
shape: (7, 3)
┌──────┬───────┬───────┐
│ time ┆ trial ┆ blink │
│ ---  ┆ ---   ┆ ---   │
│ i64  ┆ i64   ┆ bool  │
╞══════╪═══════╪═══════╡
│ 0    ┆ 1     ┆ false │
│ 1    ┆ 1     ┆ false │
│ 2    ┆ 1     ┆ true  │
│ 0    ┆ 2     ┆ false │
│ 1    ┆ 2     ┆ true  │
│ 2    ┆ 2     ┆ true  │
│ 3    ┆ 2     ┆ true  │
└──────┴───────┴───────┘
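For readers who want to check the inclusive-interval rule by hand, the membership test described in the Notes can be sketched without Polars. This is a plain-Python illustration under stated assumptions, not how pymovements builds its expression: segment_mask and its pad_before and pad_after arguments are hypothetical names, and events are passed as simple (onset, offset) pairs.

```python
from collections.abc import Iterable, Sequence


def segment_mask(
    times: Iterable[float],
    intervals: Sequence[tuple[float, float]],
    pad_before: float = 0.0,
    pad_after: float = 0.0,
) -> list[bool]:
    """Return one boolean per timestamp.

    A timestamp is True if it lies inside any padded interval.
    Both interval ends are inclusive, as in the documented convention.
    Hypothetical helper for illustration; not part of pymovements.
    """
    # Extend every interval: subtract from the onset, add to the offset.
    padded = [(onset - pad_before, offset + pad_after) for onset, offset in intervals]
    # Inclusive comparison on both ends, OR-ed across all intervals.
    return [any(lo <= t <= hi for lo, hi in padded) for t in times]


times = list(range(10))

# Same intervals as the two 'blink' events in the first example above.
mask = segment_mask(times, [(2, 5), (7, 9)])

# Same single event and symmetric padding of 1 as in the padding example.
padded_mask = segment_mask(times, [(3, 5)], pad_before=1, pad_after=1)

print(mask)
print(padded_mask)
```

With these inputs the two masks should match the blink columns of the first two doctest tables: samples 2 to 5 and 7 to 9 without padding, and samples 2 to 6 with a padding of 1.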