Retry and resume functionality for downloader by mtauraso · Pull Request #17 · lincc-frameworks/hyrax

mtauraso · 2024-08-13T20:33:37Z

Implementation of retry and resume within downloadCutout.py.

Retry means that when a request fails we will try again (defaults to 3 attempts). This is intended to address connection drops and the like. We use configurable exponential backoff to avoid a thundering herd if load from our client is causing some backend failure.
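As a rough illustration of the retry behaviour described above, here is a minimal sketch of retry with exponential backoff. It is not the code from this PR: the function name, parameter names, default delays, and the random jitter are all assumptions made for the example. Only the default of 3 attempts comes from the description.

```python
import random
import time


def retry_with_backoff(func, max_attempts=3, base_delay=1.0, factor=2.0,
                       max_delay=60.0, retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is exhausted.

    Every name and default here is hypothetical, not the PR's actual config.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay grows geometrically: base, base*factor, base*factor**2, ...
            delay = min(max_delay, base_delay * factor ** (attempt - 1))
            # Jitter (an assumption of this sketch) spreads out clients that
            # failed at the same moment, so they do not all retry together.
            sleep(delay * random.uniform(0.5, 1.0))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Errors outside `retry_on` propagate immediately, which matches the idea that only transient failures such as connection drops are worth retrying.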

Resume describes the situation where the download fails for unrecoverable reasons (e.g. HSC infra goes down) or is terminated (e.g. downloads only occur at night). This generates a resume_download.toml file in the download directory, which allows the exact same download to resume from the chunk that was in progress when the interruption occurred. This resume functionality is off by default to preserve the download() interface used by downloadCutout.py's CLI, which does not support resume.
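The chunk-level bookkeeping described above might look roughly like the sketch below. It is illustrative only: the function name, the `start_chunk` key, and the choice to delete the file once every chunk has finished are assumptions, not details confirmed by this PR.

```python
import os


def download_in_chunks(chunks, download_chunk, resume_path, start_chunk=0):
    """Download chunks in order, recording the in-progress chunk on disk.

    `start_chunk` as a TOML key is a hypothetical name for this sketch.
    """
    for index in range(start_chunk, len(chunks)):
        # Record the chunk about to start, so an interruption resumes here.
        with open(resume_path, "w") as f:
            f.write(f"start_chunk = {index}\n")
        download_chunk(chunks[index])
    # Everything finished (assumed behaviour): resume data is no longer needed.
    if os.path.exists(resume_path):
        os.remove(resume_path)
```

Because the index is written before the chunk is attempted, a failure partway through a chunk leaves that chunk's index on disk, and a later run started with that index repeats the interrupted chunk rather than skipping it.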

- Resume has been tested
- Retry has not been tested

github-actions · 2024-08-13T20:39:51Z

| Before [`b66d834`] | After [`7aaee8d`] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 4.12±1s | 1.88±1s | ~0.46 | benchmarks.time_computation |
| 96 | 2.39k | 24.92 | benchmarks.mem_list |


codecov · 2024-08-13T20:45:38Z

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 39.45%. Comparing base (b66d834) to head (fff2c17).
Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| src/fibad/download.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #17   +/-   ##
=======================================
  Coverage   39.45%   39.45%           
=======================================
  Files          10       10           
  Lines         185      185           
=======================================
  Hits           73       73           
  Misses        112      112


aritraghsh09

The changes look good to me, and I have no objections, but I have two comments:

- For the files that still fail after the specified number of attempts, can we keep a running list of those object ids somewhere and then dump them as a .npy array or something similar?
- Alternatively, I guess a separate check script could verify whether an object_id.fits exists for each object_id in the download table, and then build an array of all the object_ids for which image files were not found.
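The check script suggested in the second comment could be as small as the following sketch. The function name is hypothetical, and the `<object_id>.fits` naming is taken from the comment itself; the downloader's real file naming may differ (for example, one file per filter), so treat this as a starting point only.

```python
from pathlib import Path


def find_missing_object_ids(object_ids, download_dir):
    """Return the object ids that have no <object_id>.fits in download_dir.

    Assumes one file per object, named exactly "<object_id>.fits".
    """
    download_dir = Path(download_dir)
    return [oid for oid in object_ids
            if not (download_dir / f"{oid}.fits").exists()]
```

The returned list preserves the order of the download table, and could then be written out with `numpy.save` if a .npy file is the preferred format.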

drewoldag · 2024-08-14T21:57:21Z

src/fibad/downloadCutout/downloadCutout.py

+    """
+    # Load resume data so we start at the appropriate chunk.
+    if not os.path.exists(resume_data_filename):
+        return 0


Was this a RuntimeError before? Seems like it makes sense for it to be a raised exception, but curious if there's a good reason for it to be return 0.

Yeah, the reason for this is so if we're called with resume=True but there is no resume data, we will just download from the beginning.

This avoids a CLI flag (or other mechanism) in download.py or higher which has to know whether the user intends to resume or not.

Maybe resume should really be called resume_if_possible since that is what it really means.

Oh, ok, I see what you mean. I don't have a strong opinion here, other than perhaps cleaning up the docstring and leaving a comment like "can't find the file, so starting from index 0" or something.

drewoldag

This looks pretty good. Only one little comment about return 0 vs. raise Exception.

Initial Downloader with rety and resume functionality.

a3a4958

- Resume has been tested
- Retry has not been tested

Fixups to Retry

fff2c17

mtauraso requested review from aritraghsh09 and drewoldag August 13, 2024 20:51

mtauraso self-assigned this Aug 13, 2024

mtauraso changed the title ~~Initial Downloader with rety and resume functionality.~~ Retry and resume functionality for downloader Aug 13, 2024

mtauraso marked this pull request as ready for review August 13, 2024 20:51

aritraghsh09 approved these changes Aug 14, 2024


drewoldag reviewed Aug 14, 2024


drewoldag approved these changes Aug 14, 2024


mtauraso merged commit 7bfb7db into main Aug 14, 2024

mtauraso deleted the downloader-retry branch August 14, 2024 22:19

mtauraso merged 2 commits into main from downloader-retry


3 participants
