Loading Geomagnetic Indices and Solar Wind Parameters

el_paso.load_indices_solar_wind_parameters.load_indices_solar_wind_parameters

load_indices_solar_wind_parameters

Loads a variety of space weather indices and solar wind parameters.

This function fetches and processes data for several common space weather and solar wind indices, including Kp, Dst, solar wind plasma properties, and Tsyganenko model parameters (G1, G2, G3, W_params).

Data is downloaded and cached locally to a .elpaso directory in the user's home directory. The function can either return the data with its original timestamps or interpolate the data to a new set of timestamps provided by a target_time_variable.
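As a minimal sketch of how the cache location is derived, the snippet below mirrors the `HOME` lookup from the source listing on this page. The helper name `resolve_cache_dir` and its `env` argument are hypothetical and exist only for illustration; the library does this inline.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Return the cache directory, <HOME>/.elpaso (hypothetical helper)."""
    # Fall back to the real process environment when no mapping is given.
    env = os.environ if env is None else env
    home = env.get("HOME")
    if home is None:
        # Same failure mode as documented under "Raises".
        raise OSError("HOME environment variable is not set!")
    return Path(home) / ".elpaso"
```

Because the lookup reads `HOME` directly, the function fails on systems where that variable is not defined, so it may need to be set before the call. In the source listing, individual data sets are stored in subdirectories of this path, such as `OMNI_low_res` and `KpNiemegk`.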

Parameters:

start_time (datetime, required)
    The start time for the data retrieval.

end_time (datetime, required)
    The end time for the data retrieval.

requested_outputs (Iterable[SW_Index], required)
    The space weather indices to load. Supported values are defined by the SW_Index Literal. Although annotated as an Iterable, the argument must be a list; any other type raises TypeError.

target_time_variable (Variable | None, default: None)
    An optional ep.Variable containing the target timestamps for interpolation. If None, the raw data and its timestamps are returned.

w_parameter_method (Literal["TsyWebsite", "Calculation"], keyword-only, default: 'Calculation')
    The method used to obtain W_params. 'TsyWebsite' fetches data from the Tsyganenko website (only available until 2023), while 'Calculation' computes them from other solar wind data.

Returns:

dict[SW_Index, tuple[Variable, Variable]] | dict[SW_Index, Variable]
    A dictionary where each key is a requested index and the value is the corresponding ep.Variable object(s). If target_time_variable is provided, each value is a single ep.Variable. Otherwise, each value is a tuple of (data_variable, time_variable).
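Since the value type depends on whether `target_time_variable` was passed, calling code may want to normalize both shapes. The following is a sketch, not part of the library: `unpack` and `load_example` are hypothetical names, the dates are arbitrary, and the import path is taken from the module path shown at the top of this page. Running `load_example` downloads data into the cache directory.

```python
from datetime import datetime, timezone


def unpack(value):
    """Normalize a result entry to (data, time); time is None if interpolated."""
    if isinstance(value, tuple):
        data, time = value
        return data, time
    return value, None


def load_example():
    # Imported lazily so that unpack() can be used without el_paso installed.
    from el_paso.load_indices_solar_wind_parameters import (
        load_indices_solar_wind_parameters,
    )

    start = datetime(2015, 3, 17, tzinfo=timezone.utc)
    end = datetime(2015, 3, 18, tzinfo=timezone.utc)
    # requested_outputs must be a list; no target_time_variable is given,
    # so each value is expected to be a (data_variable, time_variable) tuple.
    results = load_indices_solar_wind_parameters(start, end, ["Kp", "Dst"])
    return {name: unpack(value) for name, value in results.items()}
```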

Raises:

TypeError
    If requested_outputs is not a list.

OSError
    If the HOME environment variable is not set.

ValueError
    If a requested output is not a supported SW_Index or if an unsupported method is requested.

FileNotFoundError
    If the data file from the Tsyganenko website is not found.
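The function only raises `ValueError` when it reaches the offending entry, so earlier entries in the list may already have been downloaded by then. One option is to check the request up front. The sketch below is an illustration under assumptions: `check_requested_outputs` is a hypothetical helper, and `SUPPORTED_OUTPUTS` is copied from the `case` branches in the source listing on this page rather than from the `SW_Index` definition, which is not shown here.

```python
# Names taken from the match statement in the source listing (assumed complete).
SUPPORTED_OUTPUTS = (
    "Kp", "Dst", "Pdyn", "IMF_Bz", "IMF_By",
    "SW_speed", "SW_density", "G1", "G2", "G3", "W_params",
)


def check_requested_outputs(requested_outputs):
    """Validate a request before any data is fetched (hypothetical helper)."""
    # The library checks for a list specifically, not for a general iterable.
    if not isinstance(requested_outputs, list):
        raise TypeError("requested_outputs must be a list of strings!")
    for name in requested_outputs:
        if name not in SUPPORTED_OUTPUTS:
            raise ValueError(f"Requested invalid output: {name}!")
    return requested_outputs
```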

Source code in el_paso/load_indices_solar_wind_parameters.py
def load_indices_solar_wind_parameters(
    start_time: datetime,
    end_time: datetime,
    requested_outputs: Iterable[SW_Index],
    target_time_variable: ep.Variable | None = None,
    *,
    w_parameter_method: Literal["TsyWebsite", "Calculation"] = "Calculation",
) -> dict[SW_Index, tuple[ep.Variable, ep.Variable]] | dict[SW_Index, ep.Variable]:
    """Loads a variety of space weather indices and solar wind parameters.

    This function fetches and processes data for several common space weather
    and solar wind indices, including Kp, Dst, solar wind plasma properties,
    and Tsyganenko model parameters (G1, G2, G3, W_params).

    Data is downloaded and cached locally to a `.elpaso` directory in the user's home
    directory. The function can either return the data with its original timestamps
    or interpolate the data to a new set of timestamps provided by a `target_time_variable`.

    Parameters:
        start_time (datetime): The start time for the data retrieval.
        end_time (datetime): The end time for the data retrieval.
        requested_outputs (Iterable[SW_Index]): A list of space weather indices to load.
                                                Supported values are defined by the `SW_Index` Literal.
                                                Must be a `list`; other iterables raise `TypeError`.
        target_time_variable (ep.Variable | None): An optional `ep.Variable` containing the
                                                    target timestamps for interpolation. If `None`,
                                                    the raw data and its timestamps are returned.
        w_parameter_method ('TsyWebsite' | 'Calculation'): The method to use for obtaining
                                                            W_params. 'TsyWebsite' fetches data from
                                                            the Tsyganenko website (only available until 2023),
                                                            while 'Calculation' computes them from other
                                                            solar wind data. Defaults to 'Calculation'.

    Returns:
        dict[SW_Index, ep.Variable | tuple[ep.Variable, ep.Variable]]: A dictionary where each key
                                                                        is a requested index and the value
                                                                        is the corresponding `ep.Variable`
                                                                        object(s). If `target_time_variable`
                                                                        is provided, the value is a single
                                                                        `ep.Variable`. Otherwise, it is a
                                                                        tuple of `(data_variable, time_variable)`.

    Raises:
        TypeError: If `requested_outputs` is not a list.
        OSError: If the HOME environment variable is not set.
        ValueError: If a requested output is not a supported `SW_Index` or if
                    an unsupported method is requested.
        FileNotFoundError: If the data file from the Tsyganenko website is not found.
    """
    start_time = enforce_utc_timezone(start_time)
    end_time = enforce_utc_timezone(end_time)

    if not isinstance(requested_outputs, list):
        msg = "requested_outputs must be a list of strings!"
        raise TypeError(msg)

    result_dict: dict[SW_Index, tuple[ep.Variable, ep.Variable]] | dict[SW_Index, ep.Variable] = {}

    home_path = os.getenv("HOME")
    if home_path is None:
        msg = "HOME environment variable is not set!"
        raise OSError(msg)

    base_data_path = Path(home_path) / ".elpaso"

    for requested_output in requested_outputs:
        match requested_output:
            case "Kp":
                kp_model_order: list[swvo_io.kp.KpModel] = [
                    swvo_io.kp.KpOMNI(base_data_path / "OMNI_low_res", prefer_env_var=True),
                    swvo_io.kp.KpNiemegk(base_data_path / "KpNiemegk", prefer_env_var=True),
                ]
                output_df = swvo_io.kp.read_kp_from_multiple_models(
                    start_time, end_time, model_order=kp_model_order, download=True
                )

                assert isinstance(output_df, pd.DataFrame)

                result = _create_variables_from_data_frame(
                    output_df, "kp", u.dimensionless_unscaled, target_time_variable, "previous"
                )

            case "Dst":
                output_df = swvo_io.dst.DSTOMNI(base_data_path / "OMNI_low_res", prefer_env_var=True).read(
                    start_time, end_time, download=True
                )

                result = _create_variables_from_data_frame(output_df, "dst", u.nT, target_time_variable, "linear")

            case "Pdyn":
                output_df = _cache_omni_high_res(base_data_path, start_time, end_time)
                assert isinstance(output_df, pd.DataFrame)

                output_df["pdyn"] = output_df["pdyn"].interpolate(method="spline", order=3).ffill().bfill()

                result = _create_variables_from_data_frame(output_df, "pdyn", u.nPa, target_time_variable, "linear")

            case "IMF_Bz":
                # we request two additional hours for interpolation
                output_df = _cache_omni_high_res(base_data_path, start_time, end_time)
                assert isinstance(output_df, pd.DataFrame)

                output_df["bz_gsm"] = output_df["bz_gsm"].interpolate(method="spline", order=3).ffill().bfill()

                result = _create_variables_from_data_frame(output_df, "bz_gsm", u.nT, target_time_variable, "linear")

            case "IMF_By":
                # we request two additional hours for interpolation
                output_df = _cache_omni_high_res(base_data_path, start_time, end_time)
                assert isinstance(output_df, pd.DataFrame)

                output_df["by_gsm"] = output_df["by_gsm"].interpolate(method="spline", order=3).ffill().bfill()

                result = _create_variables_from_data_frame(output_df, "by_gsm", u.nT, target_time_variable, "linear")

            case "SW_speed":
                # we request two additional hours for interpolation
                output_df = _cache_omni_high_res(base_data_path, start_time, end_time)
                assert isinstance(output_df, pd.DataFrame)

                output_df["speed"] = output_df["speed"].interpolate(method="spline", order=3).ffill().bfill()

                result = _create_variables_from_data_frame(
                    output_df,
                    "speed",
                    u.km * u.s**-1,
                    target_time_variable,
                    "linear",
                )

            case "SW_density":
                # we request two additional hours for interpolation
                output_df = _cache_omni_high_res(base_data_path, start_time, end_time)
                assert isinstance(output_df, pd.DataFrame)

                output_df["proton_density"] = (
                    output_df["proton_density"].interpolate(method="spline", order=3).ffill().bfill()
                )
                output_df["proton_density"] = output_df["proton_density"].clip(lower=0)

                result = _create_variables_from_data_frame(
                    output_df, "proton_density", u.cm**-3, target_time_variable, "linear"
                )

            case "G1":
                g1_var, time_var = _calculate_g1(start_time, end_time, target_time_variable)
                result = (g1_var, time_var) if target_time_variable is None else g1_var

            case "G2":
                g2_var, time_var = _calculate_g2(start_time, end_time, target_time_variable)
                result = (g2_var, time_var) if target_time_variable is None else g2_var

            case "G3":
                g3_var, time_var = _calculate_g3(start_time, end_time, target_time_variable)
                result = (g3_var, time_var) if target_time_variable is None else g3_var

            case "W_params":
                w_var, time_var = _get_w_parameters(start_time, end_time, target_time_variable, w_parameter_method)

                result = (w_var, time_var) if target_time_variable is None else w_var

            case _:
                msg = f"Requested invalid output: {requested_output}!"
                raise ValueError(msg)

        result_dict[requested_output] = result  # ty:ignore[invalid-assignment]

    return result_dict
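
In the listing above, Kp is passed to the private helper with the mode "previous", while the other quantities use "linear". The helper `_create_variables_from_data_frame` is not shown on this page, so the following is only a sketch of what these two modes conventionally mean: "previous" holds the last sample at or before each target time, and "linear" interpolates between neighbouring samples. The function `resample`, its plain-float time axis, and its edge handling (NaN before the first sample, last value held after the final one) are assumptions made for illustration.

```python
import math
from bisect import bisect_right


def resample(times, values, target_times, method):
    """Resample (times, values) onto target_times; times must be sorted ascending."""
    if method not in ("previous", "linear"):
        raise ValueError(f"Unsupported method: {method}")
    out = []
    for t in target_times:
        # Index of the last sample at or before t (-1 if t precedes all samples).
        i = bisect_right(times, t) - 1
        if i < 0:
            out.append(math.nan)
        elif method == "previous" or i == len(times) - 1:
            # Step-wise hold; also used at and beyond the final sample.
            out.append(values[i])
        else:
            # Linear blend between sample i and sample i + 1.
            w = (t - times[i]) / (times[i + 1] - times[i])
            out.append(values[i] + w * (values[i + 1] - values[i]))
    return out
```

A step-wise hold suits an index such as Kp, which is reported as one value per fixed interval, whereas a continuously varying quantity such as Dst is a more natural fit for linear interpolation.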