Saving strategy

el_paso.saving_strategy.SavingStrategy

Bases: ABC

Abstract base class for defining strategies to save output files with specific time intervals and variables.

Attributes:

Name Type Description
output_files list[OutputFile]

List of output files to be managed by the saving strategy.

data_standard DataStandard[StandardName]

The data standard that defines the variable naming convention.

base_data_path Path

The base path where output files will be saved.

satellite str

The name of the satellite for which data is being saved.

mission str

The name of the mission for which data is being saved.

instrument str

The name of the instrument for which data is being saved.

mag_field MagneticFieldLiteral

The magnetic field model used for saving data, if applicable.

Methods:

Name Description
get_time_intervals_to_save

Abstract method to determine the time intervals for saving data between start_time and end_time.

get_file_path

Abstract method to generate the file path for a given time interval and output file.

standardize_variable

Abstract method to standardize a variable before saving, possibly renaming or formatting it.

get_target_variables

Selects and prepares variables to be saved in the output file, optionally truncating them to a time range.

save_single_file

Saves the provided dictionary to a file in the specified format (.mat, .h5, .nc, .cdf), optionally appending data.

append_data

Appends data to an existing file. Not every subclass needs this, so the base class does not define it; subclasses that support appending implement it themselves.

Source code in el_paso/saving_strategy.py
class SavingStrategy(ABC):
    """Abstract base class for defining strategies to save output files with specific time intervals and variables.

    Attributes:
        output_files (list[OutputFile]): List of output files to be managed by the saving strategy.
        data_standard (DataStandard[StandardName]): The data standard that defines the variable naming convention.
        base_data_path (Path): The base path where output files will be saved.
        satellite (str): The name of the satellite for which data is being saved.
        mission (str): The name of the mission for which data is being saved.
        instrument (str): The name of the instrument for which data is being saved.
        mag_field (MagneticFieldLiteral): The magnetic field model used for saving data, if applicable.

    Methods:
        get_time_intervals_to_save:
            Abstract method to determine the time intervals for saving data between start_time and end_time.

        get_file_path:
            Abstract method to generate the file path for a given time interval and output file.

        standardize_variable:
            Abstract method to standardize a variable before saving, possibly renaming or formatting it.

        get_target_variables:
            Selects and prepares variables to be saved in the output file, optionally truncating them to a time range.

        save_single_file:
            Saves the provided dictionary to a file in the specified format (.mat, .h5, .nc, .cdf),
            optionally appending data.

        append_data:
            Appends data to an existing file. Not every subclass needs this, so the base class
            does not define it; subclasses that support appending implement it themselves.
    """

    output_files: list[OutputFile]
    data_standard: DataStandard[StandardName]
    base_data_path: Path
    satellite: str
    mission: str
    instrument: str
    mag_field: MagneticFieldLiteral

    def __repr__(self) -> str:
        cls = type(self)

        constructor_params = inspect.signature(cls.__init__).parameters

        args = []

        for name in constructor_params:
            if name == "self":
                continue

            if hasattr(self, name):
                value = getattr(self, name)
                args.append(f"{name}={value!r}")

        return f"{cls.__name__}({', '.join(args)})"

    def __str__(self) -> str:
        return self.__repr__()

    @abstractmethod
    def get_time_intervals_to_save(self, start_time: datetime, end_time: datetime) -> list[TimeInterval]:
        """Generates a list of time intervals to save between the specified start and end times.

        Args:
            start_time (datetime | None): The starting datetime for the intervals.
                                          If None, intervals may start from the earliest available time.
            end_time (datetime | None): The ending datetime for the intervals.
                                        If None, intervals may end at the latest available time.

        Returns:
            list[TimeInterval]: A list of tuples, each representing a time interval (start, end)
                to be saved.
        """

    @abstractmethod
    def get_file_path(self, interval_start: datetime, interval_end: datetime, output_file: OutputFile) -> Path:
        """Generates a file path for saving variables based on the provided interval and output file information.

        Args:
            interval_start (datetime): The start of the interval for which the file is being generated.
            interval_end (datetime): The end of the interval for which the file is being generated.
            output_file (OutputFile): An OutputFile containing the name of the output file,
                                      and which variables should be saved in this file.

        Returns:
            Path: The generated file path where the output data should be saved.
        """

    @abstractmethod
    def standardize_variable(
        self, variable: Variable, internal_name: InternalName, *, first_call_of_interval: bool
    ) -> Variable:
        """Standardizes the given variable according to the specified name in the file.

        Standardization may include checking of units, dimensions, and size consistency.

        Args:
            variable (Variable): The variable instance to be standardized.
            internal_name (str): The internal name of the variable, used for standardization rules.
            first_call_of_interval (bool): Flag indicating whether this is the first call for a time interval.

        Returns:
            Variable: The standardized variable instance.
        """

    @abstractmethod
    def save_single_file(self, file_path: Path, dict_to_save: SavedDataDict, *, append: bool = False) -> None:
        """Saves the provided dictionary to a single file in one of the supported formats (.mat, .h5, .nc).

        Parameters:
            file_path (Path): The path where the file should be saved.
            dict_to_save (dict[str, Any]): The dictionary containing variable data and metadata to be saved.
            append (bool, optional): If True, data will be appended to existing files rather than overwriting them.
                    Defaults to False.
        """

    @abstractmethod
    def get_file_path_stem(self) -> Path:
        pass

    @abstractmethod
    def get_file_name_stem(self) -> str:
        pass

    def get_target_variables(
        self,
        output_file: OutputFile,
        variables_dict: dict[InternalName, Variable],
        time_var: Variable | None,
        start_time: datetime | None,
        end_time: datetime | None,
    ) -> dict[InternalName, Variable] | None:
        """Retrieves and processes target variables for saving based on the specified output file.

        Parameters:
            output_file (OutputFile): The output file configuration containing variable names to save.
            variables_dict (dict[str, Variable]): Dictionary mapping variable names to Variable objects.
            time_var (Variable | None): The time variable used for truncation, if applicable.
            start_time (datetime | None): The start time for truncating variables, if specified.
            end_time (datetime | None): The end time for truncating variables, if specified.

        Returns:
            dict[InternalName, Variable] | None: A dictionary of processed Variable objects keyed by
                their internal names, or None if a requested variable is not found in variables_dict
                and output_file.save_incomplete is False.

        Notes:
            - If no variable names are specified in output_file, all variables in variables_dict are processed.
            - Variables are deep-copied before processing.
            - Each variable is standardized using the `standardize_variable` method.
            - If a requested variable name is not found, a warning is logged. An empty placeholder
              Variable is stored if output_file.save_incomplete is True; otherwise None is returned.
        """
        target_variables: dict[InternalName, Variable] = {}
        first_call_of_interval = True

        # if no variables have been specified, we save all of them
        if len(output_file.names_to_save) == 0:
            for key, var in variables_dict.items():
                var_to_save = deepcopy(var)

                if start_time is not None and end_time is not None and time_var is not None:
                    var_to_save.truncate(time_var, start_time.timestamp(), end_time.timestamp())
                var_to_save = self.standardize_variable(var_to_save, key, first_call_of_interval=first_call_of_interval)
                first_call_of_interval = False

                target_variables[key] = var_to_save

            return target_variables

        for name_to_save in output_file.names_to_save:
            if name_to_save in variables_dict:
                var_to_save = deepcopy(variables_dict[name_to_save])

                if start_time is not None and end_time is not None and time_var is not None:
                    var_to_save.truncate(time_var, start_time.timestamp(), end_time.timestamp())

                var_to_save = self.standardize_variable(
                    var_to_save, name_to_save, first_call_of_interval=first_call_of_interval
                )
                first_call_of_interval = False

                target_variables[name_to_save] = var_to_save
            else:
                msg = f"Could not find target variable {name_to_save}!"
                logger.warning(msg, stacklevel=2)
                if output_file.save_incomplete:
                    target_variables[name_to_save] = Variable(original_unit=u.dimensionless_unscaled, data=np.array([]))
                else:
                    return None

        return target_variables

    def get_output_file(
        self, *, standard_name: StandardName | None = None, internal_name: InternalName | None = None
    ) -> OutputFile | None:
        if internal_name is None:
            if standard_name is None:
                msg = "Either standard_name or internal_name must be provided!"
                raise ValueError(msg)
            internal_name = self.data_standard.get_internal_name(standard_name)

        if internal_name is None:
            return None

        for output_file in self.output_files:
            if internal_name in output_file.names_to_save:
                return output_file

        return None

    def get_all_standard_names(self) -> list[StandardName]:
        all_standard_names: list[StandardName] = []

        for output_file in self.output_files:
            all_standard_names.extend(
                [self.data_standard.get_standard_name(internal_name) for internal_name in output_file.names_to_save]
            )

        return list(set(all_standard_names))
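
The `__repr__` in the listing above rebuilds a constructor-style string by walking the parameters of `__init__` and printing only those that exist as attributes on the instance. The following is a minimal sketch of that pattern in isolation; the `Demo` class and its parameter names are hypothetical and only serve to show the behaviour.

```python
import inspect


class Demo:
    """Hypothetical class used only to illustrate the __repr__ pattern."""

    def __init__(self, satellite: str, mission: str, unused: int = 0) -> None:
        self.satellite = satellite
        self.mission = mission
        # 'unused' is deliberately not stored, so the repr skips it.

    def __repr__(self) -> str:
        cls = type(self)
        args = []
        for name in inspect.signature(cls.__init__).parameters:
            if name == "self":
                continue
            # Only constructor parameters that were stored as attributes are shown.
            if hasattr(self, name):
                args.append(f"{name}={getattr(self, name)!r}")
        return f"{cls.__name__}({', '.join(args)})"


demo_repr = repr(Demo("sat_a", "mission_x"))
print(demo_repr)  # Demo(satellite='sat_a', mission='mission_x')
```

A consequence of this design is that a subclass gets a readable representation without overriding `__repr__`, provided it stores its constructor arguments under the same names.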

Methods:

el_paso.saving_strategy.SavingStrategy.get_file_path abstractmethod

get_file_path

Generates a file path for saving variables based on the provided interval and output file information.

Parameters:

Name Type Description Default
interval_start datetime

The start of the interval for which the file is being generated.

required
interval_end datetime

The end of the interval for which the file is being generated.

required
output_file OutputFile

An OutputFile containing the name of the output file, and which variables should be saved in this file.

required

Returns:

Name Type Description
Path Path

The generated file path where the output data should be saved.

Source code in el_paso/saving_strategy.py
@abstractmethod
def get_file_path(self, interval_start: datetime, interval_end: datetime, output_file: OutputFile) -> Path:
    """Generates a file path for saving variables based on the provided interval and output file information.

    Args:
        interval_start (datetime): The start of the interval for which the file is being generated.
        interval_end (datetime): The end of the interval for which the file is being generated.
        output_file (OutputFile): An OutputFile containing the name of the output file,
                                  and which variables should be saved in this file.

    Returns:
        Path: The generated file path where the output data should be saved.
    """
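
Because `get_file_path` is abstract, the naming scheme is left to each subclass. As a rough sketch of what an implementation might do, the helper below joins a base path, a file name stem, the interval dates, and the output file name. The function `build_file_path`, the `<stem>_<start>to<end>_<name>.mat` layout, and the local `OutputFile` stand-in are assumptions made for illustration, not el_paso's actual convention.

```python
from datetime import datetime
from pathlib import Path
from typing import NamedTuple


class OutputFile(NamedTuple):
    # Stand-in mirroring the fields documented for OutputFile on this page.
    name: str
    names_to_save: list[str]
    save_incomplete: bool = False


def build_file_path(
    base_data_path: Path,
    file_name_stem: str,
    interval_start: datetime,
    interval_end: datetime,
    output_file: OutputFile,
) -> Path:
    """Hypothetical layout: <base>/<stem>_<start>to<end>_<name>.mat."""
    start = interval_start.strftime("%Y%m%d")
    end = interval_end.strftime("%Y%m%d")
    return base_data_path / f"{file_name_stem}_{start}to{end}_{output_file.name}.mat"


path = build_file_path(
    Path("data"),
    "sat_instr",
    datetime(2017, 1, 1),
    datetime(2017, 1, 2),
    OutputFile("flux", []),
)
print(path.as_posix())  # data/sat_instr_20170101to20170102_flux.mat
```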

el_paso.saving_strategy.SavingStrategy.get_target_variables

get_target_variables

Retrieves and processes target variables for saving based on the specified output file.

Parameters:

Name Type Description Default
output_file OutputFile

The output file configuration containing variable names to save.

required
variables_dict dict[str, Variable]

Dictionary mapping variable names to Variable objects.

required
time_var Variable | None

The time variable used for truncation, if applicable.

required
start_time datetime | None

The start time for truncating variables, if specified.

required
end_time datetime | None

The end time for truncating variables, if specified.

required

Returns:

Type Description
dict[InternalName, Variable] | None

A dictionary of processed Variable objects keyed by their internal names, or None if a requested variable is not found in variables_dict and output_file.save_incomplete is False.

Notes
  • If no variable names are specified in output_file, all variables in variables_dict are processed.
  • Variables are deep-copied before processing.
  • Each variable is standardized using the standardize_variable method.
  • If a requested variable name is not found, a warning is logged. An empty placeholder Variable is stored if output_file.save_incomplete is True; otherwise None is returned.
Source code in el_paso/saving_strategy.py
def get_target_variables(
    self,
    output_file: OutputFile,
    variables_dict: dict[InternalName, Variable],
    time_var: Variable | None,
    start_time: datetime | None,
    end_time: datetime | None,
) -> dict[InternalName, Variable] | None:
    """Retrieves and processes target variables for saving based on the specified output file.

    Parameters:
        output_file (OutputFile): The output file configuration containing variable names to save.
        variables_dict (dict[str, Variable]): Dictionary mapping variable names to Variable objects.
        time_var (Variable | None): The time variable used for truncation, if applicable.
        start_time (datetime | None): The start time for truncating variables, if specified.
        end_time (datetime | None): The end time for truncating variables, if specified.

    Returns:
        dict[InternalName, Variable] | None: A dictionary of processed Variable objects keyed by
            their internal names, or None if a requested variable is not found in variables_dict
            and output_file.save_incomplete is False.

    Notes:
        - If no variable names are specified in output_file, all variables in variables_dict are processed.
        - Variables are deep-copied before processing.
        - Each variable is standardized using the `standardize_variable` method.
        - If a requested variable name is not found, a warning is logged. An empty placeholder
          Variable is stored if output_file.save_incomplete is True; otherwise None is returned.
    """
    target_variables: dict[InternalName, Variable] = {}
    first_call_of_interval = True

    # if no variables have been specified, we save all of them
    if len(output_file.names_to_save) == 0:
        for key, var in variables_dict.items():
            var_to_save = deepcopy(var)

            if start_time is not None and end_time is not None and time_var is not None:
                var_to_save.truncate(time_var, start_time.timestamp(), end_time.timestamp())
            var_to_save = self.standardize_variable(var_to_save, key, first_call_of_interval=first_call_of_interval)
            first_call_of_interval = False

            target_variables[key] = var_to_save

        return target_variables

    for name_to_save in output_file.names_to_save:
        if name_to_save in variables_dict:
            var_to_save = deepcopy(variables_dict[name_to_save])

            if start_time is not None and end_time is not None and time_var is not None:
                var_to_save.truncate(time_var, start_time.timestamp(), end_time.timestamp())

            var_to_save = self.standardize_variable(
                var_to_save, name_to_save, first_call_of_interval=first_call_of_interval
            )
            first_call_of_interval = False

            target_variables[name_to_save] = var_to_save
        else:
            msg = f"Could not find target variable {name_to_save}!"
            logger.warning(msg, stacklevel=2)
            if output_file.save_incomplete:
                target_variables[name_to_save] = Variable(original_unit=u.dimensionless_unscaled, data=np.array([]))
            else:
                return None

    return target_variables
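
The selection rules in the listing above can be hard to follow among the truncation and standardization calls. The sketch below isolates only the branching: an empty request selects everything, and a missing name either produces an empty placeholder or aborts with `None`, depending on `save_incomplete`. The function `select_targets` is a simplified stand-in written for this page; it uses plain lists instead of `Variable` objects and omits truncation and standardization.

```python
import logging
from copy import deepcopy

logger = logging.getLogger(__name__)


def select_targets(names_to_save, variables, *, save_incomplete):
    """Simplified stand-in for the selection branches; values are plain lists."""
    if not names_to_save:
        # Nothing requested explicitly: take a copy of every variable.
        return {key: deepcopy(value) for key, value in variables.items()}

    selected = {}
    for name in names_to_save:
        if name in variables:
            selected[name] = deepcopy(variables[name])
            continue
        logger.warning("Could not find target variable %s!", name)
        if not save_incomplete:
            return None  # one missing variable aborts the whole file
        selected[name] = []  # empty placeholder keeps the file complete in shape
    return selected


variables = {"flux": [1, 2], "energy": [3]}
print(select_targets([], variables, save_incomplete=False))
print(select_targets(["flux", "missing"], variables, save_incomplete=True))
print(select_targets(["flux", "missing"], variables, save_incomplete=False))
```

Callers therefore need to handle a `None` result whenever `save_incomplete` is left at its default of `False`.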

el_paso.saving_strategy.SavingStrategy.get_time_intervals_to_save abstractmethod

get_time_intervals_to_save

Generates a list of time intervals to save between the specified start and end times.

Parameters:

Name Type Description Default
start_time datetime | None

The starting datetime for the intervals. If None, intervals may start from the earliest available time.

required
end_time datetime | None

The ending datetime for the intervals. If None, intervals may end at the latest available time.

required

Returns:

Type Description
list[TimeInterval]

A list of tuples, each representing a time interval (start, end) to be saved.

Source code in el_paso/saving_strategy.py
@abstractmethod
def get_time_intervals_to_save(self, start_time: datetime, end_time: datetime) -> list[TimeInterval]:
    """Generates a list of time intervals to save between the specified start and end times.

    Args:
        start_time (datetime | None): The starting datetime for the intervals.
                                      If None, intervals may start from the earliest available time.
        end_time (datetime | None): The ending datetime for the intervals.
                                    If None, intervals may end at the latest available time.

    Returns:
        list[TimeInterval]: A list of tuples, each representing a time interval (start, end)
            to be saved.
    """
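
How the range is split is up to the subclass. One plausible choice is one file per calendar day; the sketch below shows that under stated assumptions. The function `daily_intervals` and the `TimeInterval` alias are hypothetical stand-ins, and whether el_paso's own strategies split by day, month, or another unit is not specified on this page.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for el_paso's TimeInterval, assumed here to be a (start, end) tuple.
TimeInterval = tuple[datetime, datetime]


def daily_intervals(start_time: datetime, end_time: datetime) -> list[TimeInterval]:
    """Return whole-day intervals that together cover [start_time, end_time)."""
    # Snap the first interval back to midnight of the starting day.
    day = start_time.replace(hour=0, minute=0, second=0, microsecond=0)
    intervals: list[TimeInterval] = []
    while day < end_time:
        intervals.append((day, day + timedelta(days=1)))
        day += timedelta(days=1)
    return intervals


intervals = daily_intervals(
    datetime(2020, 1, 1, 12, tzinfo=timezone.utc),
    datetime(2020, 1, 3, 6, tzinfo=timezone.utc),
)
for start, end in intervals:
    print(start.isoformat(), "->", end.isoformat())
```

In this sketch the first and last intervals extend beyond the requested range so that every file covers a full day; a strategy that clips to the requested range would be an equally valid design.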

el_paso.saving_strategy.SavingStrategy.save_single_file abstractmethod

save_single_file

Saves the provided dictionary to a single file in one of the supported formats (.mat, .h5, .nc).

Parameters:

Name Type Description Default
file_path Path

The path where the file should be saved.

required
dict_to_save dict[str, Any]

The dictionary containing variable data and metadata to be saved.

required
append bool

If True, data will be appended to existing files rather than overwriting them. Defaults to False.

False
Source code in el_paso/saving_strategy.py
@abstractmethod
def save_single_file(self, file_path: Path, dict_to_save: SavedDataDict, *, append: bool = False) -> None:
    """Saves the provided dictionary to a single file in one of the supported formats (.mat, .h5, .nc).

    Parameters:
        file_path (Path): The path where the file should be saved.
        dict_to_save (dict[str, Any]): The dictionary containing variable data and metadata to be saved.
        append (bool, optional): If True, data will be appended to existing files rather than overwriting them.
                Defaults to False.
    """

el_paso.saving_strategy.SavingStrategy.standardize_variable abstractmethod

standardize_variable

Standardizes the given variable according to the specified name in the file.

Standardization may include checking of units, dimensions, and size consistency.

Parameters:

Name Type Description Default
variable Variable

The variable instance to be standardized.

required
internal_name str

The internal name of the variable, used for standardization rules.

required
first_call_of_interval bool

Flag indicating whether this is the first call for a time interval.

required

Returns:

Name Type Description
Variable Variable

The standardized variable instance.

Source code in el_paso/saving_strategy.py
@abstractmethod
def standardize_variable(
    self, variable: Variable, internal_name: InternalName, *, first_call_of_interval: bool
) -> Variable:
    """Standardizes the given variable according to the specified name in the file.

    Standardization may include checking of units, dimensions, and size consistency.

    Args:
        variable (Variable): The variable instance to be standardized.
        internal_name (str): The internal name of the variable, used for standardization rules.
        first_call_of_interval (bool): Flag indicating whether this is the first call for a time interval.

    Returns:
        Variable: The standardized variable instance.
    """

el_paso.saving_strategy.OutputFile

Bases: NamedTuple

Represents an output file with its name and a list of variable names to save.

Attributes:

Name Type Description
name str

The name of the output file.

names_to_save list[str]

List of variable names to be saved in the output file. If the list is empty, all available variables are saved.

save_incomplete bool

If True, allows saving even if some variables are missing; each missing variable is stored as an empty placeholder. Defaults to False.

Source code in el_paso/saving_strategy.py
class OutputFile(NamedTuple):
    """Represents an output file with its name and a list of variable names to save.

    Attributes:
        name (str): The name of the output file.
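        names_to_save (list[str]): List of variable names to be saved in the output file.
            If the list is empty, all available variables are saved.
        save_incomplete (bool): If True, allows saving even if some variables are missing;
            each missing variable is stored as an empty placeholder. Defaults to False.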
    """

    name: str
    names_to_save: list[InternalName]
    save_incomplete: bool = False
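
Since `OutputFile` is a `NamedTuple`, it can be constructed positionally or by keyword, unpacked like a tuple, and copied with `_replace`. The short example below uses a local copy of the definition with `InternalName` replaced by `str`, so that it runs without el_paso installed; the file and variable names are made up.

```python
from typing import NamedTuple


class OutputFile(NamedTuple):
    # Local copy of the definition above, with InternalName replaced by str.
    name: str
    names_to_save: list[str]
    save_incomplete: bool = False


# An empty names_to_save list requests every available variable.
full = OutputFile("full", [])

# Keyword construction with an explicit variable list and incomplete saving enabled.
flux = OutputFile(name="flux", names_to_save=["Flux", "Energy"], save_incomplete=True)

# NamedTuple instances unpack like ordinary tuples.
name, names, incomplete = flux

# _replace returns a modified copy; the original is left unchanged.
strict_flux = flux._replace(save_incomplete=False)

print(full)
print(name, names, incomplete)
print(strict_flux.save_incomplete)
```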