Many datasets store interval annotations as onset + duration rather than explicit interval boundaries. This notebook shows how to use DimensionInterval with onset/duration coordinates directly, without needing to construct pd.IntervalIndex objects.
Why onset/duration?¶
Common annotation formats such as Praat's TextGrid, and many annotation tools, export intervals as:
- onset: when the interval starts
- duration: how long the interval lasts
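Onset/duration is just another encoding of half-open boundaries: the interval covered is [onset, onset + duration). If you ever need the explicit boundaries, pandas can derive them directly; a minimal sketch using the "hello" and "world" values from this notebook's example data:

import pandas as pd

# Derive explicit interval boundaries from onset + duration.
onsets = pd.Series([0.5, 2.1])      # "hello", "world"
durations = pd.Series([1.2, 1.8])
idx = pd.IntervalIndex.from_arrays(onsets, onsets + durations, closed="left")
print(idx)  # [0.5, 1.7) and [2.1, 3.9)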
The linked_indices library provides helper functions to convert annotation DataFrames directly to xarray coordinates with proper naming conventions.
import xarray as xr
from linked_indices import DimensionInterval, example_data
from linked_indices.example_data import (
    intervals_from_dataframe,
    intervals_from_long_dataframe,
)

Loading annotation data¶
Annotation data typically comes as a pandas DataFrame with onset, duration, and label columns. Let’s load some example speech annotation data:
# Load example speech annotations (simulating data from Praat, TextGrid, etc.)
annotations = example_data.speech_annotations()
annotations

Notice that the annotations have gaps between them - this is common in real speech data where there are pauses between words. For example, “hello” ends at 1.7s but “world” doesn’t start until 2.1s.
Converting DataFrame to xarray coordinates¶
The intervals_from_dataframe function converts annotation DataFrames to xarray Datasets with properly named coordinates ({dim}_onset, {dim}_duration):
# Convert annotations DataFrame to xarray coordinates
word_coords = intervals_from_dataframe(annotations, dim_name="word", label_col="word")
word_coords

The helper automatically creates:

- word as the dimension coordinate (from label_col)
- word_onset and word_duration as coordinates (named {dim}_onset, {dim}_duration)
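Conceptually, the conversion is a thin wrapper around xr.Dataset. A rough sketch of what a helper like this does (not the library's actual implementation; it assumes onset and duration columns like the annotations above):

import pandas as pd
import xarray as xr

def intervals_sketch(df: pd.DataFrame, dim_name: str, label_col: str) -> xr.Dataset:
    # Labels become the dimension coordinate; onset and duration become
    # plain coordinates on that dimension, following the {dim}_onset /
    # {dim}_duration naming convention.
    return xr.Dataset(
        coords={
            dim_name: (dim_name, df[label_col].values),
            f"{dim_name}_onset": (dim_name, df["onset"].values),
            f"{dim_name}_duration": (dim_name, df["duration"].values),
        }
    )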
Adding audio data¶
Now we can add our continuous audio signal and merge with the annotation coordinates:
# Generate a simulated audio signal
times, audio_signal = example_data.generate_audio_signal(duration=10.0)
# Create dataset by merging annotation coordinates with audio data
ds = word_coords.copy()
ds["audio"] = (("time",), audio_signal)
ds = ds.assign_coords(time=times)
ds

Applying the DimensionInterval index¶
To link the time and word dimensions, apply DimensionInterval with the onset_duration_coords option mapping dimension names to (onset_coord, duration_coord) tuples:
ds = ds.drop_indexes(["time", "word"]).set_xindex(
    ["time", "word_onset", "word_duration", "word"],
    DimensionInterval,
    onset_duration_coords={"word": ("word_onset", "word_duration")},
)

ds

ds.xindexes["word"]

ds.coord_viz()

ds.coord_inspector["word"]

Notice that:
- The word_onset and word_duration coordinates remain visible
- All coordinates are linked under a single DimensionInterval index
- No manual coordinate creation was needed - the helper handled naming conventions
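You can verify the first two points programmatically with plain xarray APIs (the identity check assumes all four coordinates share one DimensionInterval instance, as set up above):

# Onset and duration are still ordinary, visible coordinates.
print("word_onset" in ds.coords and "word_duration" in ds.coords)  # True

# The time and word coordinates resolve to the same index object.
print(ds.xindexes["time"] is ds.xindexes["word"])  # True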
Selecting data¶
Selection works exactly the same as with the IntervalIndex format. When you select on any dimension, all other dimensions are automatically constrained.
# Select by word label - time is automatically constrained
ds.sel(word="hello")

# Select by time range - words are automatically constrained
ds.sel(time=slice(2, 5))

# Select by onset value
ds.sel(word_onset=4.5)
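Behind each of these selections is simple interval arithmetic. A plain NumPy sketch of the membership test that links time to word, for the default left-closed intervals (toy values taken from the example annotations):

import numpy as np

onsets = np.array([0.5, 2.1])      # "hello", "world"
durations = np.array([1.2, 1.8])
t = 1.0

# Left-closed membership: onset <= t < onset + duration
mask = (onsets <= t) & (t < onsets + durations)
print(mask)  # [ True False] -> t = 1.0 falls inside "hello"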
Handling gaps¶

Our word annotations have gaps between them (silence between words). Let’s see what happens when we select time in a gap:
# Time 1.8 to 2.0 is in the gap between "hello" (ends at 1.7) and "world" (starts at 2.1)
ds.sel(time=slice(1.75, 2.0))

When selecting multiple words with gaps between them using isel, the time dimension spans the union of their intervals (including the gap). Here we select “hello” [0.5, 1.7) and “world” [2.1, 3.9):
# Select first two words - time spans from 0.5 to 3.9, including the gap
ds.isel(word=slice(0, 2))
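If you would rather drop the gap samples than span them, you can mask the time axis against the selected intervals yourself. A sketch assuming the ds built above and plain positional indexing on time:

import numpy as np

sub = ds.isel(word=slice(0, 2))
t = sub["time"].values
starts = sub["word_onset"].values
stops = starts + sub["word_duration"].values

# Keep a time sample only if it falls inside at least one selected word.
inside = ((starts <= t[:, None]) & (t[:, None] < stops)).any(axis=1)
sub_no_gap = sub.isel(time=np.flatnonzero(inside))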
Multiple onset/duration dimensions¶

You can have multiple interval dimensions, each with its own onset/duration coordinates. This is common for hierarchical annotations like words and phonemes. The helper function makes it easy to convert each level:
# Load multi-level annotations (words and phonemes)
word_annotations, phoneme_annotations = example_data.multi_level_annotations()
display(word_annotations)
display(phoneme_annotations)
# Convert each DataFrame to xarray coordinates using helpers
word_ds = intervals_from_dataframe(word_annotations, dim_name="word", label_col="word")
phoneme_ds = intervals_from_dataframe(
    phoneme_annotations, dim_name="phoneme", label_col="phoneme"
)

# Merge annotation coordinates and add audio data
times, audio = example_data.generate_audio_signal(duration=10.0)
ds_multi = xr.merge([word_ds, phoneme_ds])
ds_multi["audio"] = (("time",), audio)
ds_multi = ds_multi.assign_coords(time=times)
# Apply index with both onset/duration mappings
ds_multi = ds_multi.drop_indexes(["time", "word", "phoneme"]).set_xindex(
    [
        "time",
        "word_onset",
        "word_duration",
        "word",
        "part_of_speech",
        "phoneme_onset",
        "phoneme_duration",
        "phoneme",
    ],
    DimensionInterval,
    onset_duration_coords={
        "word": ("word_onset", "word_duration"),
        "phoneme": ("phoneme_onset", "phoneme_duration"),
    },
)

ds_multi

# Select word "hello" - both time AND phonemes are constrained
ds_multi.sel(word="hello")

# Select by part of speech - finds all nouns
ds_multi.sel(part_of_speech="noun")
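The hierarchical constraint boils down to interval overlap: a phoneme is kept for a word when the two intervals overlap. A toy sketch of that rule (illustrative numbers, not the library's exact semantics):

import numpy as np

w_on, w_off = 0.5, 1.7                    # word "hello"
p_on = np.array([0.5, 0.8, 1.2, 2.1])     # phoneme onsets
p_off = np.array([0.8, 1.2, 1.7, 2.4])    # phoneme offsets

# Two half-open intervals overlap when each starts before the other ends.
overlaps = (p_on < w_off) & (p_off > w_on)
print(overlaps)  # [ True  True  True False]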
Controlling interval closedness¶

By default, intervals are left-closed [onset, onset+duration). You can change this with the interval_closed option:
# Reload data for fresh example
annotations = example_data.speech_annotations()
times, audio = example_data.generate_audio_signal()
# Create with right-closed intervals (onset, onset+duration]
ds_right = xr.Dataset(
    {"audio": (("time",), audio)},
    coords={
        "time": times,
        "word_onset": ("word", annotations["onset"].values),
        "word_duration": ("word", annotations["duration"].values),
        "word": ("word", annotations["word"].values),
    },
)
ds_right = ds_right.drop_indexes(["time", "word"]).set_xindex(
    ["time", "word_onset", "word_duration", "word"],
    DimensionInterval,
    onset_duration_coords={"word": ("word_onset", "word_duration")},
    interval_closed="right",  # Options: "left", "right", "both", "neither"
)
print("Created dataset with right-closed intervals (onset, onset+duration]")
Summary¶

The onset/duration format provides a convenient way to work with interval data without manually constructing pd.IntervalIndex objects:
1. Load annotations as a pandas DataFrame (from TextGrid, Praat, CSV, etc.)
2. Convert to coordinates using intervals_from_dataframe() or intervals_from_long_dataframe()
3. Merge and add data - combine annotation coordinates with your continuous data
4. Apply the index with the onset_duration_coords mapping
5. Select data - all selection operations work identically to the IntervalIndex format
Helper functions¶
| Function | Use case |
|---|---|
| intervals_from_dataframe() | Convert a single-event-type DataFrame |
| intervals_from_long_dataframe() | Convert a multi-event-type DataFrame with a category column |
Key features¶
- Natural representation: Use onset + duration directly from your data files
- Library helpers: Handle coordinate naming conventions automatically
- Visible coordinates: onset and duration remain as regular coordinates
- Full functionality: All selection operations work identically
- Multiple dimensions: Support for multiple onset/duration pairs
- Gap support: Non-contiguous intervals work correctly
- Mixed events: Handle DataFrames with multiple event types
Handling multiple event types in one DataFrame¶
Sometimes annotation data comes as a single “long format” DataFrame with multiple event types (words, phonemes, stimuli, etc.) distinguished by a category column. The intervals_from_long_dataframe function handles this case.
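As a quick illustration of the long format, here is a toy table built by hand (column names follow the manual-iteration example at the end of this notebook; the exact schema returned by example_data.mixed_event_annotations() may differ):

import pandas as pd

# One row per event; the event_type column distinguishes the interval dimensions.
long_df = pd.DataFrame(
    {
        "event_type": ["word", "word", "phoneme", "stimulus"],
        "label": ["hello", "world", "h", "image_A"],
        "onset": [0.5, 2.1, 0.5, 0.0],
        "duration": [1.2, 1.8, 0.1, 5.0],
    }
)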
# Load example mixed-event annotations
mixed_df = example_data.mixed_event_annotations()
mixed_df

# Convert all event types at once
intervals_from_long_dataframe(mixed_df)

# Add time/audio and apply DimensionInterval
times, audio = example_data.generate_audio_signal(duration=10.0)
interval_ds = intervals_from_long_dataframe(mixed_df)
ds_mixed = interval_ds.copy()
ds_mixed["audio"] = (("time",), audio)
ds_mixed = ds_mixed.assign_coords(time=times)
# Apply the index with all three event types
ds_mixed = ds_mixed.drop_indexes(["time", "word", "phoneme", "stimulus"]).set_xindex(
    [
        "time",
        "word_onset",
        "word_duration",
        "word",
        "phoneme_onset",
        "phoneme_duration",
        "phoneme",
        "stimulus_onset",
        "stimulus_duration",
        "stimulus",
    ],
    DimensionInterval,
    onset_duration_coords={
        "word": ("word_onset", "word_duration"),
        "phoneme": ("phoneme_onset", "phoneme_duration"),
        "stimulus": ("stimulus_onset", "stimulus_duration"),
    },
)

ds_mixed

# Selecting a stimulus constrains words and phonemes too
ds_mixed.sel(stimulus="image_A")

Manual iteration for selective event types¶
If you only want some event types, you can filter and apply intervals_from_dataframe iteratively:
# Only include words and phonemes (exclude stimuli)
datasets = []
for event_type in ["word", "phoneme"]:
    subset = mixed_df[mixed_df["event_type"] == event_type].drop(columns=["event_type"])
    ds_subset = intervals_from_dataframe(subset, dim_name=event_type, label_col="label")
    datasets.append(ds_subset)
xr.merge(datasets)
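The merged result can then be indexed exactly like the two-level example earlier; a sketch reusing that pattern (and the times and audio from the cells above):

ds_partial = xr.merge(datasets)
ds_partial["audio"] = (("time",), audio)
ds_partial = ds_partial.assign_coords(time=times)
ds_partial = ds_partial.drop_indexes(["time", "word", "phoneme"]).set_xindex(
    [
        "time",
        "word_onset",
        "word_duration",
        "word",
        "phoneme_onset",
        "phoneme_duration",
        "phoneme",
    ],
    DimensionInterval,
    onset_duration_coords={
        "word": ("word_onset", "word_duration"),
        "phoneme": ("phoneme_onset", "phoneme_duration"),
    },
)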