Interpolate in time

el_paso.processing.interpolate_in_time

Functions:

el_paso.processing.interpolate_in_time.interpolate_in_time

interpolate_in_time

Interpolates one or more variables onto a new time axis, using the interpolation method specified for each variable.

This function takes an original time variable and a dictionary of other variables, then interpolates these variables over a new time basis. The new basis can either be generated via a target cadence (with optional start/end times) or explicitly provided via a target time variable.
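As a rough illustration of the cadence-based option, the sketch below builds a regular axis of POSIX timestamps with np.arange, the same call the source listing uses. The helper name regular_time_axis is hypothetical and not part of el_paso. Note that np.arange excludes the stop value, so the end time itself is not part of the axis.

```python
from datetime import datetime, timedelta, timezone

import numpy as np


def regular_time_axis(start: datetime, end: datetime, cadence: timedelta) -> np.ndarray:
    """Return POSIX timestamps from start (inclusive) to end (exclusive).

    Hypothetical helper for illustration only; not part of el_paso.
    """
    return np.arange(start.timestamp(), end.timestamp(), cadence.total_seconds())


# Timezone-aware datetimes avoid any dependence on the local timezone.
start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 1, 0, tzinfo=timezone.utc)
axis = regular_time_axis(start, end, timedelta(minutes=15))

# One hour at a 15-minute cadence gives four points; 01:00 is excluded.
print(len(axis))
```

If the final sample must be included, one option is to pass an end time that lies one cadence step later.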

Parameters:

time_variable (Variable, required): The master time variable that defines the original time basis for all other variables. Its data should be in a time unit (e.g., ep.units.posixtime or ep.units.datenum).

variables (dict[str, Variable], required): A dictionary mapping variable names to the ep.Variable objects to be interpolated. The first dimension of each variable's data must match the length of time_variable.

interpolation_method_dict (dict[str, InterpolationMethod], required): A dictionary mapping variable names to interpolation method strings (e.g., "linear", "nearest"). Variables without an entry in this dictionary are skipped and left unchanged.

target_cadence (timedelta | None, default None): The time step of a regular target time axis, given as a datetime.timedelta. Required unless target_time_variable is provided.

target_time_variable (Variable | None, default None): An explicit target time variable to interpolate onto. If provided, target_cadence, start_time, and end_time are ignored.

start_time (datetime | None, default None): A datetime.datetime object specifying the start of the generated target axis. If None, the first timestamp of time_variable is used.

end_time (datetime | None, default None): A datetime.datetime object specifying the end of the generated target axis. If None, the last timestamp of time_variable is used. The end time itself is excluded from the generated axis.

fill_value (Literal['extrapolate'] | float, default nan): The value assigned to target timestamps outside the bounds of the original time variable. Set to "extrapolate" to extrapolate instead.

max_gap_seconds (float | None, default None): The maximum allowable gap, in seconds, between two consecutive original timestamps. Target timestamps that fall inside a larger gap are set to np.nan instead of being interpolated; targets that coincide exactly with an original timestamp are kept.
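The max_gap_seconds rule can be sketched with np.searchsorted, closely following the source listing. The function name large_gap_mask and the sample numbers are made up for illustration, and np.isin is used here as a simplified stand-in for the exact-match checks in the original code.

```python
import numpy as np


def large_gap_mask(timestamps, targets, max_gap_seconds):
    """Flag targets that fall strictly inside an original gap wider than max_gap_seconds.

    Hypothetical helper for illustration only; assumes sorted timestamps.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    targets = np.asarray(targets, dtype=float)

    # idx is the position of the first original timestamp >= each target.
    idx = np.searchsorted(timestamps, targets)
    in_bounds = (idx > 0) & (idx < len(timestamps))

    # Width of the original interval that brackets each in-bounds target.
    gaps = np.zeros_like(targets)
    gaps[in_bounds] = timestamps[idx[in_bounds]] - timestamps[idx[in_bounds] - 1]

    # Targets that coincide with an original sample are never masked.
    on_sample = np.isin(targets, timestamps)
    return in_bounds & (gaps > max_gap_seconds) & ~on_sample


timestamps = [0, 10, 20, 100, 110]  # 80 s gap between 20 and 100
targets = [5, 20, 60, 100, 105, 200]
mask = large_gap_mask(timestamps, targets, max_gap_seconds=30)
print(mask)
```

Only the target at 60 is flagged: it lies inside the 80-second gap, whereas 100 sits exactly on an original timestamp and 200 is out of bounds, where fill_value applies instead.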

Returns:

Variable: An ep.Variable object representing the new interpolated time axis. The variables dictionary passed as an argument is modified in place, with each variable's data updated to its interpolated values.

Raises:

ValueError: If neither target_cadence nor target_time_variable is provided, or if the first dimension size of any variable's data does not match the length of the time_variable data.

TypeError: If an input data array is not a numeric type.
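Because the variables are modified in place, it can be useful to check arrays before calling the function. The sketch below mirrors the two per-variable checks from the source listing; check_interpolatable is a hypothetical helper, not an el_paso function. One consequence of using np.issubdtype(..., np.number) is that boolean arrays are rejected as well, since NumPy does not treat bool as a numeric subtype.

```python
import numpy as np


def check_interpolatable(name: str, data, n_times: int) -> None:
    """Raise the same exception types as the checks in interpolate_in_time.

    Hypothetical pre-flight helper for illustration only.
    """
    data = np.asarray(data)
    if data.shape[0] != n_times:
        msg = f"Variable {name}: size of dimension 0 does not match length of time variable!"
        raise ValueError(msg)
    if not np.issubdtype(data.dtype, np.number):
        msg = f"Variable {name}: interpolation is only supported for numeric types!"
        raise TypeError(msg)


# A (time, channel) float array with three time steps passes silently.
check_interpolatable("flux", np.zeros((3, 2)), n_times=3)

# Boolean flags are not considered numeric by np.issubdtype.
print(np.issubdtype(np.dtype(bool), np.number))
```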

Source code in el_paso/processing/interpolate_in_time.py
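import logging
from datetime import datetime, timedelta
from typing import Literal

import numpy as np
from scipy.interpolate import interp1d

import el_paso as ep  # assumed alias, inferred from the ep.* references below

# timed_function, datenum_to_datetime and InterpolationMethod are el_paso helpers;
# their import lines fall outside this excerpt.


@timed_function()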
def interpolate_in_time(
    time_variable: ep.Variable,
    variables: dict[str, ep.Variable],
    interpolation_method_dict: dict[str, InterpolationMethod],
    target_cadence: timedelta | None = None,
    target_time_variable: ep.Variable | None = None,
    start_time: datetime | None = None,
    end_time: datetime | None = None,
    fill_value: Literal["extrapolate"] | float = np.nan,
    max_gap_seconds: float | None = None,
) -> ep.Variable:
    """Interpolates one or more variables by time according to specified methods and target axis.

    This function takes an original time variable and a dictionary of other variables, then
    interpolates these variables over a new time basis. The new basis can either be
    generated via a target cadence (with optional start/end times) or explicitly provided
    via a target time variable.

    Args:
        time_variable (ep.Variable): The master time variable that defines the original
            time basis for all other variables. Its data should be in a time
            unit (e.g., `ep.units.posixtime` or `ep.units.datenum`).
        variables (dict[str, ep.Variable]): A dictionary where keys are variable names (str) and values
            are the `ep.Variable` objects to be interpolated.
        interpolation_method_dict (dict[str, InterpolationMethod]): A dictionary mapping variable names (str) to
            interpolation method strings (e.g., "linear", "nearest"), specifying how each variable should be
            interpolated. If a variable is not present in this dictionary, it will be skipped.
        target_cadence (timedelta | None): Optional. A `datetime.timedelta` object specifying the
            duration of each time step for generating a regular target time axis.
        target_time_variable (ep.Variable | None): Optional. An explicit target time variable
            to interpolate onto. If provided, `target_cadence`, `start_time`, and `end_time` are ignored.
        start_time (datetime | None): Optional. A `datetime.datetime` object specifying the
            start time for generating the target axis. If None, the start time of `time_variable`
            is used.
        end_time (datetime | None): Optional. A `datetime.datetime` object specifying the end
            time for generating the target axis. If None, the end time of `time_variable` is used.
        fill_value (Literal["extrapolate"] | float): Optional. The value used to fill data points outside the bounds
            of the original time variable. Defaults to `np.nan`. Can also be set to `"extrapolate"`.
        max_gap_seconds (float | None): Optional. The maximum allowable time gap (in seconds) between two
            consecutive original timestamps. Target timestamps falling within a gap larger than this
            value will not be interpolated and will be masked with `np.nan`.

    Returns:
        ep.Variable: An `ep.Variable` object representing the new interpolated time axis. The
        `variables` dictionary passed as an argument is modified in place, with
        each variable's data updated to its interpolated values.

    Raises:
        ValueError: If neither `target_cadence` nor `target_time_variable` is provided,
            or if the first dimension size of any variable's data does not match the
            length of the `time_variable` data.
        TypeError: If an input data array is not a numeric type.
    """
    logger = logging.getLogger(__name__)
    logger.info("Interpolating by time...")

    if target_cadence is None and target_time_variable is None:
        msg = "Either target_cadence or target_time_variable must be provided!"
        raise ValueError(msg)

    # Determine target time coordinates
    if target_time_variable is not None:
        target_timestamps = target_time_variable.get_data(ep.units.posixtime).astype(np.float64)
        new_time_var = target_time_variable
    else:
        start_time = start_time or datenum_to_datetime(time_variable.get_data(ep.units.datenum)[0])
        end_time = end_time or datenum_to_datetime(time_variable.get_data(ep.units.datenum)[-1])

        if target_cadence is None:
            msg = "target_cadence must be specified if target_time_variable is None!"
            raise ValueError(msg)

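        # np.arange excludes the stop value, so end_time itself is not part of the axis.
        # Naive datetimes are interpreted in local time by datetime.timestamp().
        target_timestamps = np.arange(start_time.timestamp(), end_time.timestamp(), target_cadence.total_seconds())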
        new_time_var = ep.Variable(data=target_timestamps, original_unit=ep.units.posixtime)
        new_time_var.metadata.add_processing_note("Created while time interpolating.")

    timestamps = time_variable.get_data(ep.units.posixtime).astype(np.float64)
    original_cadence = float(np.nanmedian(np.diff(timestamps)))

    # Pre-calculate reusable search indices and exact match conditions for gaps
    idx = np.searchsorted(timestamps, target_timestamps)
    in_bounds = (idx > 0) & (idx < len(timestamps))
    exact_match_right = (idx < len(timestamps)) & (
        target_timestamps == timestamps[np.minimum(idx, len(timestamps) - 1)]
    )
    exact_match_left = (idx > 0) & (target_timestamps == timestamps[idx - 1])

    # Pre-calculate the max_gap_seconds mask
    max_gap_mask = None
    if max_gap_seconds is not None:
        gaps = np.zeros_like(target_timestamps, dtype=float)
        gaps[in_bounds] = timestamps[idx[in_bounds]] - timestamps[idx[in_bounds] - 1]
        is_large_gap = in_bounds & (gaps > max_gap_seconds)
        max_gap_mask = is_large_gap & ~exact_match_right & ~exact_match_left

    for key, var in variables.items():
        if key not in interpolation_method_dict:
            continue

        # Check if time variable and data content sizes match
        if var.get_data().shape[0] != len(timestamps):
            msg = f"Variable {key}: size of dimension 0 does not match length of time variable!"
            raise ValueError(msg)

        old_data = var.get_data()
        if not np.issubdtype(old_data.dtype, np.number):
            msg = f"Interpolation (method: {interpolation_method_dict[key]}) is only supported for numeric types!"
            raise TypeError(msg)

        # Identify NaN rows along the time axis (axis=0)
        if old_data.ndim > 1:
            is_nan_time = np.any(np.isnan(old_data), axis=tuple(range(1, old_data.ndim)))
        else:
            is_nan_time = np.isnan(old_data)

        # Generate a per-variable mask to isolate target timestamps falling inside an original NaN gap
        is_nan_gap = None
        if np.any(is_nan_time):
            left_is_nan = np.zeros_like(target_timestamps, dtype=bool)
            right_is_nan = np.zeros_like(target_timestamps, dtype=bool)

            left_is_nan[idx > 0] = is_nan_time[idx[idx > 0] - 1]
            right_is_nan[idx < len(timestamps)] = is_nan_time[
                np.minimum(idx[idx < len(timestamps)], len(timestamps) - 1)
            ]

            # Target timestamps are inside a NaN gap if bounded by an original NaN point
            is_nan_gap = in_bounds & (left_is_nan | right_is_nan)
            # Protect target items that land perfectly on an original valid measurement
            is_nan_gap = is_nan_gap & ~(exact_match_right & ~right_is_nan) & ~(exact_match_left & ~left_is_nan)

        # Interpolate using only valid data pairs to prevent spline corruption or matrix failures
        if np.any(is_nan_time):
            valid_mask = ~is_nan_time
            if not np.any(valid_mask):
                interpolated_data = np.full((len(target_timestamps), *old_data.shape[1:]), np.nan)
                f = None
            else:
                f = interp1d(
                    timestamps[valid_mask],
                    old_data[valid_mask, ...],
                    kind=interpolation_method_dict[key],
                    axis=0,
                    bounds_error=False,
                    fill_value=fill_value,
                )
        else:
            f = interp1d(
                timestamps,
                old_data,
                kind=interpolation_method_dict[key],
                axis=0,
                bounds_error=False,
                fill_value=fill_value,
            )

        if f is not None:
            interpolated_data = f(target_timestamps)

        # Enforce maximum permissible data gaps
        if max_gap_mask is not None:
            interpolated_data[max_gap_mask, ...] = np.nan

        # Enforce NaN gap rules (no bridging across missing measurements)
        if is_nan_gap is not None:
            interpolated_data[is_nan_gap, ...] = np.nan

        if interpolated_data.shape[0] != len(target_timestamps):
            msg = "Encountered shape mismatch after time interpolation!"
            raise ValueError(msg)

        # Update data and metadata in-place
        var.set_data(np.array(interpolated_data), "same")
        var.metadata.original_cadence_seconds = original_cadence

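        # target_cadence is ignored whenever an explicit target time variable is given
        cadence_desc = (
            f"cadence of {target_cadence.total_seconds() / 60} minutes"
            if target_time_variable is None and target_cadence is not None
            else "custom target time axis"
        )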
        var.metadata.add_processing_note(
            f"Time interpolated with method '{interpolation_method_dict[key]}' to {cadence_desc}"
        )

    return new_time_var
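The NaN handling in the listing above is not described in the parameter documentation, so a small 1-D sketch may help. Interpolation runs on valid samples only, and any target whose neighbouring original sample is NaN is then set back to NaN, so missing measurements are not bridged. The function name interpolate_without_bridging is hypothetical, and np.interp is used as a stand-in for interp1d(kind="linear") with NaN outside the data range.

```python
import numpy as np


def interpolate_without_bridging(timestamps, values, targets):
    """Linear interpolation that refuses to bridge NaN samples.

    Hypothetical 1-D illustration only; assumes sorted timestamps.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    values = np.asarray(values, dtype=float)
    targets = np.asarray(targets, dtype=float)

    is_nan = np.isnan(values)
    valid = ~is_nan
    if not valid.any():
        return np.full(targets.shape, np.nan)

    # Interpolate on valid samples only; out-of-range targets become NaN.
    result = np.interp(targets, timestamps[valid], values[valid], left=np.nan, right=np.nan)

    # Look up the original neighbours of each target.
    idx = np.searchsorted(timestamps, targets)
    in_bounds = (idx > 0) & (idx < len(timestamps))
    left_is_nan = np.zeros(targets.shape, dtype=bool)
    right_is_nan = np.zeros(targets.shape, dtype=bool)
    left_is_nan[in_bounds] = is_nan[idx[in_bounds] - 1]
    right_is_nan[in_bounds] = is_nan[idx[in_bounds]]

    # A target sitting exactly on a valid sample keeps its value.
    on_valid_sample = np.isin(targets, timestamps[valid])
    result[in_bounds & (left_is_nan | right_is_nan) & ~on_valid_sample] = np.nan
    return result


timestamps = [0, 10, 20, 30, 40]
values = [0.0, 10.0, np.nan, 30.0, 40.0]  # the sample at t=20 is missing
targets = [5, 10, 15, 25, 30, 35]
result = interpolate_without_bridging(timestamps, values, targets)
print(result)
```

The targets at 15 and 25 border the missing sample at t=20 and come back as NaN, while 10 and 30 land on valid measurements and keep their values.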