Linked Indices

Custom xarray indexes for keeping multiple coordinates in sync across shared dimensions.

Overview

This library provides custom xarray Index implementations that automatically constrain related dimensions when you select on any one of them.

DimensionInterval

DimensionInterval efficiently stores arbitrary intervals over a continuous coordinate. It is similar to a MultiIndex but more general. See the comparison with MultiIndex for how the two differ.

Figure: diagram of possible sel calls for DimensionInterval.

See the Multi-Interval Example for a detailed walkthrough.
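
To make the idea of intervals over a continuous coordinate concrete, here is a minimal sketch using plain pandas and NumPy rather than this library. It only illustrates the lookup that a linked interval index has to perform, not how DimensionInterval is implemented. The time axis, the second word "green", and the interval breaks are hypothetical; only the [0, 40) span for "red" mirrors the Quick Start.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a continuous time axis and two labeled word intervals.
time = np.arange(0, 100, 10)  # 0, 10, ..., 90
word_intervals = pd.IntervalIndex.from_breaks([0, 40, 100], closed="left")
words = np.array(["red", "green"])  # "green" is an assumed label

# Point lookup: which word interval contains t=50?
# get_indexer returns the interval position, or -1 if no interval matches.
pos = word_intervals.get_indexer([50])[0]
word_at_50 = words[pos]

# Reverse lookup: which time samples fall inside the interval for "red"?
red = word_intervals[list(words).index("red")]
in_red = (time >= red.left) & (time < red.right)
red_times = time[in_red]

print(word_at_50)
print(red_times)
```

A linked index bundles both directions of this lookup so that one sel call narrows every related dimension at once.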

NDIndex

NDIndex lets you select on N-dimensional derived coordinates, such as absolute time computed from trial offsets plus relative time.

Figure: diagram of two possible abs-rel indexes.

See the ND Coordinates Example for a detailed walkthrough covering trial-based data with both absolute and relative time coordinates.
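
As a rough illustration of what an N-dimensional derived coordinate looks like, the sketch below builds a 2D absolute-time array from per-trial onsets and a shared relative-time axis using NumPy broadcasting, then locates the cells that fall in an absolute-time window. It does not use NDIndex; all names and values (trial_onset, rel_time, the window bounds) are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical values: three trials starting at these absolute times (seconds).
trial_onset = np.array([0.0, 10.0, 20.0])
# Relative time within each trial, shared by all trials.
rel_time = np.array([0.0, 1.0, 2.0, 3.0])

# Broadcasting yields a 2D (trial, rel_time) array of absolute times.
abs_time = trial_onset[:, np.newaxis] + rel_time[np.newaxis, :]

# Find the cells whose absolute time lies in the half-open window [11, 13).
mask = (abs_time >= 11.0) & (abs_time < 13.0)
trial_idx, rel_idx = np.nonzero(mask)

print(abs_time.shape)
print(trial_idx, rel_idx)
```

Selecting on such a coordinate means translating a condition on the 2D array back into positions along each underlying dimension, which is the bookkeeping an ND index is meant to take over.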

Use Cases

Installation

pip install git+https://github.com/ianhi/xarray-linked-indexes

Quick Start

from linked_indices.example_data import multi_interval_dataset
from linked_indices import DimensionInterval

# Load example dataset with time, words, and phonemes
ds = multi_interval_dataset()

# Apply the linked index
ds = ds.drop_indexes(["time", "word", "phoneme"]).set_xindex(
    ["time", "word_intervals", "phoneme_intervals", "word", "part_of_speech", "phoneme"],
    DimensionInterval,
)
ds

Now selecting on any dimension automatically constrains all other dimensions to overlapping values:

# Select word "red" - time and phonemes are auto-constrained to [0, 40)
ds.sel(word="red")

Using onset/duration format

As an alternative to creating pd.IntervalIndex objects, you can use onset/duration coordinates directly. This is useful when your data comes from annotation tools that export onset + duration format.

ds = ds.drop_indexes(["time", "word"]).set_xindex(
    ["time", "word_onset", "word_duration", "word"],
    DimensionInterval,
    onset_duration_coords={"word": ("word_onset", "word_duration")},
)

Options:

See the Onset/Duration Example notebook for a detailed walkthrough.
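
For comparison, this is roughly what building the intervals by hand from onset and duration values looks like in plain pandas. It is a sketch under assumptions: the onset and duration values are made up, and the left-closed convention is inferred from the [0, 40) range shown in the Quick Start. Whether DimensionInterval performs this exact conversion internally is not stated here.

```python
import numpy as np
import pandas as pd

# Hypothetical annotation export: one onset and one duration per word.
word_onset = np.array([0.0, 40.0, 75.0])
word_duration = np.array([40.0, 30.0, 25.0])

# Each interval is [onset, onset + duration); closed="left" is an assumption.
word_intervals = pd.IntervalIndex.from_arrays(
    word_onset, word_onset + word_duration, closed="left"
)

# Times in a gap between annotations (here 70 to 75) match no interval,
# so get_indexer reports -1 for them.
gap_pos = word_intervals.get_indexer([72.0])[0]

print(word_intervals)
print(gap_pos)
```

Passing onset/duration coordinates directly avoids this manual step when the annotation tool already exports the data in that shape.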

Examples

See the example notebook for a detailed walkthrough of multiple interval types (words, phonemes) over a shared continuous time dimension.

API Reference

DimensionInterval

The main index class for linking multiple interval dimensions over a single continuous dimension.

Features:

Known Limitations: