Download

el_paso.download.download

download

Download satellite data files within a specified time range and cadence.

Examples can be found in the 'examples' and 'tutorials' folders.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `start_time` | `datetime` | The start of the time range for downloading files. Must be timezone-aware (UTC). | required |
| `end_time` | `datetime` | The end of the time range for downloading files. Must be timezone-aware (UTC). | required |
| `save_path` | `str \| Path` | Directory path where downloaded files will be saved. | required |
| `file_cadence` | `Literal['daily', 'monthly', 'single_file']` | Cadence of the files to download. "daily": download one file for each day in the range. "monthly": download one file for each month in the range (see Raises). "single_file": download a single file. | required |
| `download_url` | `str` | Base URL for downloading files. | required |
| `file_name_stem` | `str` | Stem for the file name to be downloaded. | required |
| `download_arguments_prefixes` | `str` | Additional arguments to prefix to the download command (used with wget). | `''` |
| `download_arguments_suffixes` | `str` | Additional arguments to suffix to the download command (used with wget). | `''` |
| `method` | `Literal['request', 'wget', 'esa_swe']` | Download method to use: "request" (Python requests), "wget" (system wget), or "esa_swe" (accepted by the function signature). | `'request'` |
| `authentication_info` | `tuple[str, str]` | Tuple of (username, password) for authentication. | `('', '')` |
| `rename_file_name_stem` | `str \| None` | If provided, rename the downloaded file to this stem. | `None` |
| `skip_existing` | `bool` | If True, skip downloading files that already exist. | `True` |
| `sort_raw_files_by_time` | `bool` | If True, create subdirectories for each year and month (e.g., 'YYYY/MM/'). This helps organize a large number of downloaded files. | `False` |
| `max_threads` | `int` | Maximum number of threads used for downloading. | `4` |
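
Both time bounds are documented as timezone-aware UTC datetimes. The following is a minimal sketch of how such values might be built and checked with the standard library before they are passed as `start_time` and `end_time`. The helper `is_utc_aware` is hypothetical and not part of el_paso.

```python
from datetime import datetime, timedelta, timezone


def is_utc_aware(value: datetime) -> bool:
    """Return True if `value` has tzinfo with a zero UTC offset (hypothetical helper)."""
    offset = value.utcoffset()  # None for naive datetimes
    return offset is not None and offset == timedelta(0)


# Timezone-aware bounds, suitable as start_time / end_time.
start_time = datetime(2020, 1, 1, tzinfo=timezone.utc)
end_time = datetime(2020, 1, 4, tzinfo=timezone.utc)

# A naive datetime carries no tzinfo and does not meet the documented requirement.
naive = datetime(2020, 1, 1)

print(is_utc_aware(start_time))  # True
print(is_utc_aware(naive))       # False
```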

Raises:

| Type | Description |
| --- | --- |
| `NotImplementedError` | If `file_cadence` is "monthly" or any other unsupported value. |
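
To illustrate how the cadence determines the individual download intervals, here is a self-contained sketch modeled on the loop in the source listing on this page. The function `next_time_for_cadence` is a hypothetical stand-in for the library's private step helper, whose implementation is not shown here. The fixed 24-hour step for "daily" and the exact error message are assumptions.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def next_time_for_cadence(curr_time: datetime, file_cadence: str) -> datetime | None:
    """Hypothetical stand-in for the library's private step helper."""
    if file_cadence == "daily":
        return curr_time + timedelta(days=1)  # assumption: fixed 24-hour steps
    if file_cadence == "single_file":
        return None  # no intermediate step: one interval covers the whole range
    # "monthly" and unknown values are rejected, as the Raises table documents.
    msg = f"File cadence '{file_cadence}' is not implemented."
    raise NotImplementedError(msg)


def split_range(start: datetime, end: datetime, file_cadence: str) -> list[tuple[datetime, datetime]]:
    """Split [start, end) into consecutive (interval_start, interval_end) pairs."""
    intervals = []
    curr = start
    while curr < end:
        nxt = next_time_for_cadence(curr, file_cadence)
        nxt = end if nxt is None else min(nxt, end)  # never step past the end
        intervals.append((curr, nxt))
        curr = nxt
    return intervals


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
end = datetime(2020, 1, 3, 12, tzinfo=timezone.utc)

print(len(split_range(start, end, "daily")))        # 3
print(len(split_range(start, end, "single_file")))  # 1
```

Under these assumptions, the last daily interval is clipped to `end`, so a range of two and a half days yields three intervals.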

Returns:

| Type | Description |
| --- | --- |
| `None` | Nothing is returned; downloaded files are written to `save_path`. |
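
Since the function returns nothing, its visible result is the set of files under `save_path`. With `sort_raw_files_by_time=True`, the documentation describes a 'YYYY/MM/' subdirectory layout. The sketch below shows how such a target directory could be derived. The helper `target_directory` is hypothetical, and taking the year and month from the start of each interval is an assumption, not something this page confirms.

```python
from datetime import datetime, timezone
from pathlib import Path


def target_directory(save_path: Path, interval_start: datetime, *, sort_by_time: bool) -> Path:
    """Return the directory a file would be saved in (hypothetical helper)."""
    if not sort_by_time:
        return save_path
    # Assumption: year and month are taken from the start of the interval.
    return save_path / f"{interval_start:%Y}" / f"{interval_start:%m}"


base = Path("data/raw")
when = datetime(2020, 3, 15, tzinfo=timezone.utc)

print(target_directory(base, when, sort_by_time=False).as_posix())  # data/raw
print(target_directory(base, when, sort_by_time=True).as_posix())   # data/raw/2020/03
```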

Source code in el_paso/download.py
@timed_function()
def download(
    start_time: datetime,
    end_time: datetime,
    save_path: str | Path,
    file_cadence: Literal["daily", "monthly", "single_file"],
    download_url: str,
    file_name_stem: str,
    download_arguments_prefixes: str = "",
    download_arguments_suffixes: str = "",
    method: Literal["request", "wget", "esa_swe"] = "request",
    authentication_info: tuple[str, str] = ("", ""),
    rename_file_name_stem: str | None = None,
    *,
    sort_raw_files_by_time: bool = False,
    skip_existing: bool = True,
    max_threads: int = 4,
) -> None:
    """Download satellite data files within a specified time range and cadence.

    Examples can be found in the 'examples' and 'tutorials' folders.

    Args:
        start_time (datetime): The start of the time range for downloading files. Must be timezone-aware (UTC).
        end_time (datetime): The end of the time range for downloading files. Must be timezone-aware (UTC).
        save_path (str | Path): Directory path where downloaded files will be saved.
        file_cadence (Literal["daily", "monthly", "single_file"]): Frequency of file downloads.
            - "daily": Download files for each day in the range.
            - "monthly": Download files for each month in the range.
            - "single_file": Download a single file.
        download_url (str): Base URL for downloading files.
        file_name_stem (str): Stem for the file name to be downloaded.
        download_arguments_prefixes (str, optional): Additional arguments to prefix to the download command
                                                     (used with wget). Defaults to "".
        download_arguments_suffixes (str, optional): Additional arguments to suffix to the download command
                                                     (used with wget). Defaults to "".
        method (Literal["request", "wget", "esa_swe"], optional): Download method to use: "request" (Python
                                                       requests), "wget" (system wget), or "esa_swe".
                                                       Defaults to "request".
        authentication_info (tuple[str, str], optional): Tuple of (username, password) for authentication.
                                                           Defaults to ("", "").
        rename_file_name_stem (str | None, optional): If provided, rename the downloaded file to this stem.
                                                      Defaults to None.
        skip_existing (bool, optional): If True, skip downloading files that already exist. Defaults to True.
        sort_raw_files_by_time (bool, optional):
                If True, creates subdirectories for each year and month (e.g., 'YYYY/MM/').
                This helps organize a large number of downloaded files. Defaults to False.
        max_threads (int, optional): Maximum number of threads used for downloading. Defaults to 4.

    Raises:
        NotImplementedError: If file_cadence is "monthly" or any other unsupported value.

    Returns:
        None

    """
    start_time = enforce_utc_timezone(start_time)
    end_time = enforce_utc_timezone(end_time)

    save_path = Path(save_path)

    curr_time = start_time
    tasks = []

    while curr_time < end_time:
        next_time = _get_next_time(curr_time, file_cadence)
        next_time = end_time if next_time is None else min(next_time, end_time)

        tasks.append((curr_time, next_time))
        curr_time = next_time

    if len(tasks) > 1:
        logger.info(f"Starting parallel download with {max_threads} threads for {len(tasks)} files...")

    with ThreadPoolExecutor(max_workers=max_threads) as executor:
        future_to_time = {
            executor.submit(
                _download_single_step,
                curr_time=t_start,
                next_time=t_end,
                save_path=save_path,
                method=method,
                download_url=download_url,
                file_name_stem=file_name_stem,
                download_arguments_prefixes=download_arguments_prefixes,
                download_arguments_suffixes=download_arguments_suffixes,
                authentication_info=authentication_info,
                rename_file_name_stem=rename_file_name_stem,
                skip_existing=skip_existing,
                sort_raw_files_by_time=sort_raw_files_by_time,
            ): t_start
            for t_start, t_end in tasks
        }

        for future in as_completed(future_to_time):
            t_start = future_to_time[future]
            try:
                future.result()
            except Exception as exc:  # noqa: BLE001
                logger.warning(f"Download for date {t_start} generated an exception: {exc}")
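
The listing above submits one task per interval to a thread pool and logs a warning when a task fails, so a single failed download does not stop the others. The same pattern can be sketched in isolation as follows. The `fetch` function is a placeholder that simulates a failure, and collecting the failed items in a list is an addition for illustration that the original function does not do.

```python
from __future__ import annotations

import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def fetch(day: int) -> str:
    """Placeholder task; fails for day 2 to demonstrate error handling."""
    if day == 2:
        msg = f"simulated failure for day {day}"
        raise RuntimeError(msg)
    return f"file_{day}.dat"


def run_all(days: list[int], max_threads: int = 4) -> tuple[list[str], list[int]]:
    """Run fetch() for every day; return sorted (results, failed_days)."""
    results, failed = [], []
    with ThreadPoolExecutor(max_workers=max_threads) as executor:
        # Map each future back to its input so failures can be attributed.
        future_to_day = {executor.submit(fetch, day): day for day in days}
        for future in as_completed(future_to_day):
            day = future_to_day[future]
            try:
                results.append(future.result())  # re-raises the worker's exception
            except Exception as exc:
                logger.warning("Task for day %s failed: %s", day, exc)
                failed.append(day)
    return sorted(results), sorted(failed)


results, failed = run_all([1, 2, 3])
print(results)  # ['file_1.dat', 'file_3.dat']
print(failed)   # [2]
```

Because `as_completed` yields futures in completion order rather than submission order, the sketch sorts both lists before returning them.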