-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce resumable downloads with --resume-retries #12991
base: main
Are you sure you want to change the base?
Conversation
- Added —resume-retries option to allow resuming incomplete downloads - Setting —resume-retries=N allows pip to make N attempts to resume downloading, in case of dropped or timed out connections - Each resume attempt uses the values specified for —retries and —timeout internally Signed-off-by: gmargaritis <[email protected]>
16fb735
to
dbc6a64
Compare
I'm guessing the CI fails because of the new linter rules introduced in 102d818 |
Does this do rsync-style checksums? That would increase reliability. |
Signed-off-by: gmargaritis <[email protected]>
Hey @notatallshaw 👋 Is there anything that I can do to move this one forward? |
A pip maintainer needs to take up the task of reviewing it, as we're all volunteers it's a matter of finding time. I think my main concern would be the behavior when interacting with index servers that behave badly, e.g. give the wrong content length (usually 0). Your description looks good to me, but I haven't had time to look over the code yet. |
Yeah, I know how it goes, so no worries! If you need any clarifications or would like me to make changes, I'd be happy to help! |
any chances that it'll be merged soon? |
I've had an initial cursory glace at this PR and it appears to be sufficiently high quality. I've also run the functionality locally (select a large wheel to download and then disconnect my WiFi midway through the download) and it has a good UX. My main concern, although this is a ship that has probably sailed, is it would be nice for pip not to have to directly handle HTTP intricacies and leave that to a separate library. I can’t promise a full review or other maintainers will agree, but I am adding it to the 25.1 milestone for it to be tracked. |
The PR looks good, although I’m not a http expert so I can’t comment on details like status and header handling. Like @notatallshaw I wish we could leave this sort of detail to a 3rd party library, but that would be a major refactoring. Add this PR (along with cert handling, parallel downloads, etc) to the list of reasons we should consider such a refactoring, but in the meantime I’m in favour of adding this. |
There isn’t an “approve with conditions” button, but I approve this change on the basis that someone who understands http should check the header and status handling. |
Just so everyone is on the same page, I plan on re-reviewing this PR sometime this week. I'm working on prototyping some code style changes which I'll share soon. Beyond that, I'd like to review the other parts of the resuming UX. After that, I should be happy enough with this to merge it and let any other suggestions be handled at a later date. |
7e165af
to
c146e81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, I've completed another review pass. In this review, I've taken a closer look at the code structure and the user-facing messaging.
Firstly, I pushed three commits to your branch:
-
ff2ccd2 - I didn't like how
bytes_received
was being passed absolutely everywhere. Thus, I removed it as a parameter to_write_chunks_to_file
. The method will only return how many bytes it has downloaded. It's on the calling method to keep track of how many bytes have been downloaded. I also made some other simplifications where it didn't impact readability. -
2808134 - The point of the
exceptions.py
module is to separate the error printing logic from the business logic. For that reason, I moved all of the formatting logic inside the exception. Finally, I made some changes to the error itself:- If resumes are disabled, don't even bother saying that there are 0 attempts configured
- If resumes are disabled, provide a more specific hint to enable resumes using
--resume-retries
- Don't repeat how many attempts are configured
- "File" -> "URL"
- Drop the "error" suffix from the error code, it's redundant
-
c146e81 - The "Attempting to resume" warning was reworked to be easier to for non technical users. Also, I replaced any mention of "retries" with "attempts" as retries IMO is confusing with the
--retries
flag
Please let me know if you have any issues with these changes.
Secondly, I have some larger design questions:
(As I mention at the end, these questions are not blocking. I want to discuss them, but no changes need to be made to get this PR landed.)
-
How about we rename the
--resume-retries
flag to--resume-attempts
? I know that doesn't follow the pattern set by--retries
, but--resume-retries
is IMO confusing because it sounds similar to--retries
but configures something else entirely (in addition, the former is handled by urllib3, while we handle the latter). Or because this PR actually enables download resumption and restarting (see discussion below), perhaps--download-retries
? -
Should restarts be allowed while retrying a download? My concern is that an user will be okay with pip resuming a download (perhaps even 10s of times) but wouldn't be okay with restarting the download from scratch several times (e.g., they're on a metered connection). I had a semi-private discussion with @emmatyping about this. I've decided that this is fine for now, but I want to revisit this before we make resuming the default.
More broadly, this PR has been pretty hairy to review. That's unavoidable, of course, but I think it highlights that this feature could be confusing for end-users, too.
With this PR, there are several layers to our HTTP logic. urllib3
handles connection timeouts, read timeouts, and request retrying. These correspond to the --retries
and --timeout
flags. Now, there's also our own logic to retry a HTTP download, corresponding to --resume-retries
. Note the emphasis on "retry". While reviewing this PR, I got confused by the fact resuming and restarting are both supported. This PR generally refers to both features as "resuming" but IMO this is really "download retrying" (we resume from where we left off, but restart if resuming is impossible). This is confusing, and that's why I decided to switch to use the word "attempt" as it better encapsulates both behaviours.
Realistically, this nuance is going to be hard to get across1, but I just don't want our users to have the wrong impression of the functionality. This PR doesn't have add documentation. That's fine for now, but we should add some docs before making this the default.
Anyway, that's my review. I still haven't reviewed the tests, but that's not a blocking concern.
I want to thank you again for the PR! I will admit that I've been more nit-picky than usual. That's for two reasons: A) this is the first large pip PR I've reviewed, and B) this is complicated and I want to get this right. The latter is doubly important as it's hard to change things in pip down the line. Fortunately, backwards compatibility isn't really a huge problem with download resuming, but it's still something to consider.
On that note, I do want to get this in the pip 25.1 release, so barring any major objections, I will merge this in ~probably a week or two. I'm in favour of the PR and it's good enough to land as-is. We'll get feedback once pip 25.1 is out and we can make more changes based on that feedback as needed.
Footnotes
-
I also recognize that as a maintainer/reviewer, I am more aware of the nuance and subtlety involved here than any regular pip user. It's likely that is contributing to the confusion, and the average user won't be as confused as they won't even know enough to be confused 🙂 ↩
Oh right, here are some screenshots of the new messaging: (It says "Resuming download of" here. I'd made this adjustment earlier, but I reverted the change in my last commit) The urllib3 request retrying warnings are very noisy here, but this is probably the less common scenario for retrying. I'm disconnecting the Internet entirely to trigger the retry logic. In practice, I'd imagine the download would fail, but the request/connection can be easily restored (so urllib3 won't complain). Either way, this is a pre-existing issue and should be dealt with separately. |
@ichard26 WHat's the status on this? It seems close to ready, but there are still some design questions being asked. Is release 25.1 (in a couple of weeks) still a realistic target here, or should we defer it? This is a big enough feature that I'd like to see it go in, but conversely, I really don't want it as a last-minute merge a few days before the release. |
@pfmoore we've decided that this feature should be opt-in upon release, only to be enabled by default in pip 25.2 or later once we've gotten some feedback. Thus, I'm not too worried about the exact implementation details. Those can change in a future release if needed. I'm also pretty happy with the code as-is. The sticking point I have is that I'm still not sure of the UI of resumable downloads. While the rest of the feature can be reworked in future releases, it's likely not feasible to rename a flag once released. Any thoughts @pfmoore @notatallshaw? I'm rather torn and can't make up my mind. Footnotes
|
My current feeling is that it's better to stick with a simpler UI for pip 25.1. If we get complaints, then we can consider giving the users more control later (à la Proposal:
Footnotes
|
I'm going to keep out of the design discussions. I have a bunch of other things on my plate, and not much spare time, so I don't want to add anything else. I will say, though, that I don't like the idea of changing the UI once it becomes default. That's not (in my mind) what the I'm going to say that I'd rather not have it in 25.1 unless it's complete (so assuming it's gated behind Footnotes
|
@ichard26 I'm not a fan of this proposal for a couple of reasons: Firstly, I really dislike changing how features are enabled between pip versions, it makes guides outdated quickly and it more difficult to write scripts against pip. Secondly, I mildly dislike overloading flag with multiple meanings, particularly to simplify reading the help at the cost of user control. What if user is working against an index they need to disable resumable retries but enable regular retries , will that be possible under the new scheme? In terms of having nuanced options, I think most users should be served well by good defaults, and users who really need something other than the default will learn the names and meanings of those additional options. |
Thank you all for your input and efforts! There have been various discussions about the naming convention in the past (#11180, #4796).
I fully agree, and I suggest keeping the existing implementation of Merging the behavior with |
@gmargaritis I didn't have time to follow up on this like I wanted today, so I'll just say that I have no more blocking concerns. I've dropped all design questions having thought about it more. I do think this could benefit from some documentation, but I can handle that later. Right now, what I would appreciate is that you take a look at the three commits I pushed and confirm that I didn't do anything you object to. You can see my review for an explanation of the changes. Once I have time—ideally tomorrow—I'll follow up properly and clear this for merge. Thanks for your tenacity on this PR! |
@ichard26 I've reviewed the code once more and it looks great! Thank you for implementing these improvements. If I can suggest one thing, it would be to update the help text for the Specifically, I recommend changing the help text for --retries <retries> Maximum number of retries each connection should attempt (default 5 times). to: --retries <retries> Maximum attempts to establish a new connection (default: 5). This change clarifies that Similarly, I suggest updating the text for --resume-retries <resume_retries> Maximum attempts to resume an incomplete download (default: 0). I believe this will help our users better understand the purpose of these options and how they differ. Overall, I feel confident that this PR is in great shape and will be a valuable improvement. Let me know if you need anything, I'd be happy to help! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note: I'd like to merge this PR myself. I'll wait a little bit to give people a chance to share concerns after reading my update, but otherwise, this is good to land.)
OK, here's my promised follow-up.
First off, good news. I'm approving this PR 🎉. It's amazing to see resumable downloads finally come to fruition. This has been a long-standing point of friction for numerous pip users. Their lives will be easier because of this improvement.
Now, I want to respond to the earlier discussion and also explain what has happened since @pfmoore and @notatallshaw formally reviewed this PR.
- The code has been significantly refactored to be more concise and easier to read.1 ff2ccd2 and eed205a
- The help text for
--retries
and--resume-retries
has been updated with @gmargaritis's suggestions 616cde5 - The diagnostic error raised for an incomplete download has been reworked to easier to parse while including more information (see Introduce resumable downloads with --resume-retries #12991 (comment) for a screenshot) 2808134
- Other user messaging changes, most notably telling the user what attempt # they are on so they can get a better sense of how many retries they may need to configure (see aforementioned screenshot) c146e81
There has been a lot of changes since those reviews so I wanted to make sure everyone is on the same page to avoid future surprises.
ALSO, there is one more I want to call out: this feature will still be experimental in pip 25.1. However, the logic is (or, rather, should be) feature-ready. Once we get feedback and confirm that this feature is working out in the wild, we can flip the default number of resume retries to a non-zero value (exact value TBD). Otherwise, no extra changes are needed. I hope this alleviates your concerns @pfmoore.
In terms of having nuanced options, I think most users should be served well by good defaults, and users who really need something other than the default will learn the names and meanings of those additional options.
Having thought more about it, I've come around to this conclusion as well. I did state that I wasn't really sure earlier (I could've gone either way), but I did have some concerns. I agree that this is an advanced feature and most users won't need to care about --resume-retries
once it's turned on by default. Thus, it should get its own flag and if it's a bit technical, that's fine!
My issue with the naming is that I simply read the --resume-retries
flag totally opposite to how seemingly everyone else is reading it. I read it as "The maximum number of retries for resuming a download" which would imply that pip will attempt to resume once and bail if that's not enough by default. I realized at some point that the words should be interpreted in the other order ("The maximum number of resuming (download) retries"). I agree that --resume-retries
is probably the best name even if I still don't quite like it.2
I'm going to keep out of the design discussions.
No worries! Thanks anyway for asking questions about the code and design. Even though they may have been sparse, they let me know where I needed to put in extra consideration and thought.
I'm sorry for dragging this out. I do wish this didn't take as long as it did, but having discussed seemingly everything under the sun, I do feel confident approving this. I hope the rest of you agree with this sentiment, even if this was admittedly a lot of work.
Thank you all for your patience.
Footnotes
-
I know it's probably not recommended practice for a PR reviewer to directly push changes, but I did so because A) it was easier, and B) I expect code refactoring to be relatively uncontroversial. In addition, I fully expect that whenever this code needs to be updated, it will be me doing that thus it's important that I can (more) easily follow this code. Anyway, I am calling out these changes so if you want to take a look and feel strongly, you can object. ↩
-
Naming is hard 🙂 ↩
Awesome to hear, @ichard26! Thank you all for your efforts, this is going to be a real improvement for everyone! |
Resolves #4796
Introduced the
--resume-retries
option in order to allow resuming incomplete downloads incase of dropped or timed out connections.This option additionally uses the values specified for
--retries
and--timeout
for each resume attempt, since they are passed in the session.Used
0
as the default in order to keep backward compatibility.This PR is based on #11180