
resolving TUF target name from distribution download URL #6


Closed
jku opened this issue Jul 17, 2020 · 3 comments
Labels
API This issue relates to Warehouse client API

Comments

@jku
Owner

jku commented Jul 17, 2020

A design goal is to minimize the required client configuration. In practice I'm hoping I won't have to store the package base directory ('packages/' on files.pythonhosted.org) in the configuration.

The plan is to integrate tuf into pip at a point where we get a Link object, which contains, among other things, the full URL of the file to be downloaded and helper properties for parsing it. The question is how to extract the TUF metadata name from the URL.

Example URL:

https://files.pythonhosted.org/packages/ca/ab/5e004afa025a6fb640c6e983d4983e6507421ff01be224da79ab7de7a21f/Django-3.0.8-py3-none-any.whl#sha256=5457fc953ec560c5521b41fad9e6734a4668b7ba205832191bbdff40ec61073c

We want to extract

ca/ab/5e004afa025a6fb640c6e983d4983e6507421ff01be224da79ab7de7a21f/Django-3.0.8-py3-none-any.whl
  • With knowledge of the base package directory this is easy. Should Warehouse include that info in custom metadata, or can we just assume it's always "packages/"?
  • Another option is to define the metadata name as the filename without fragments, plus enough preceding path components to form the hash. This assumes we know how long the hash is (either Warehouse must tell us, or we are not future-proof against hash length changes).
  • Alternatively, define the metadata name as the filename without fragments plus 3 preceding directories. This is not great for future-proofing (but I have no idea if three directory levels could in practice become too few in the future).
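For illustration, the first option could look roughly like the sketch below. It assumes the base directory really is always "packages/", and the function and constant names are hypothetical (this is not pip or Warehouse code). It uses only the standard library rather than pip's Link helpers.

```python
from urllib.parse import urlsplit

# Assumption: the base directory is always "packages/" (the first option).
PACKAGES_PREFIX = "/packages/"


def target_name_from_known_prefix(url: str) -> str:
    """Return the URL path below the base directory, ignoring any #fragment."""
    # urlsplit() puts the "#sha256=..." part in .fragment, not in .path.
    path = urlsplit(url).path
    if not path.startswith(PACKAGES_PREFIX):
        raise ValueError(f"unexpected download path: {path!r}")
    return path[len(PACKAGES_PREFIX):]
```

The obvious cost is the one raised above: the prefix becomes client-side knowledge that has to be either hard-coded or delivered through metadata.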
jku added the "API" label (This issue relates to Warehouse client API) Jul 17, 2020
jku changed the title from "Define TUF target name resolution" to "resolving TUF target name from download URL" Jul 17, 2020
jku changed the title from "resolving TUF target name from download URL" to "resolving TUF target name from distribution download URL" Aug 6, 2020
@woodruffw

  • Metadata name is the filename without fragments with enough preceding path components to form the hash: this assumes we know how long the hash is (either warehouse must tell us or we are not future proof for hash length changes)

This sounds reasonable to me -- it's unlikely that Warehouse will be moving away from BLAKE2b anytime soon, so this could just be a constant that's baked into pip. Then again, it certainly would make migrating a pain, should that ever need to occur. But I expect that such a migration would require a total turnover of the package index links anyways, so perhaps that's not a big deal.

@jku
Owner Author

jku commented Aug 14, 2020

Then again, it certainly would make migrating a pain, should that ever need to occur. But I expect that such a migration would require a total turnover of the package index links anyways, so perhaps that's not a big deal.

In the very unlikely event of a hash change, the worst outcome is that clients that did not upgrade in time would have to use "--disable-package-security" (or whatever it will be called) once to upgrade pip, so maybe this is reasonable.

For now, I'll work with the assumption that the metadata name is the filename without fragments, plus enough preceding path components to form a 256-bit blake2b hash. Thanks for the reply.
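A minimal sketch of that assumption, for illustration only: walk backwards from the filename and collect directory names until they add up to a hex-encoded 256-bit digest (64 characters). The names are hypothetical, and the hex-only check is an extra assumption that is not stated in this thread.

```python
from urllib.parse import urlsplit

# Assumption: a 256-bit BLAKE2b digest, hex-encoded, is 64 characters long.
HASH_HEX_LENGTH = 64
HEX_DIGITS = set("0123456789abcdef")


def target_name_from_hash_length(url: str) -> str:
    """Filename plus as many preceding directories as form a 64-char hash."""
    parts = urlsplit(url).path.split("/")  # .path excludes the #fragment
    filename = parts[-1]
    hash_parts = []
    collected = 0
    # Walk backwards over the directories that precede the filename.
    for part in reversed(parts[:-1]):
        if collected >= HASH_HEX_LENGTH:
            break
        if not part or not set(part) <= HEX_DIGITS:
            raise ValueError(f"not a hash component: {part!r}")
        hash_parts.insert(0, part)
        collected += len(part)
    if not filename or collected != HASH_HEX_LENGTH:
        raise ValueError(f"cannot find a {HASH_HEX_LENGTH}-character hash in {url!r}")
    return "/".join(hash_parts + [filename])
```

This version does not need to know the base directory or the number of directory levels, only the digest length.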

@jku
Owner Author

jku commented Oct 2, 2020

The current implementation takes the metadata name to be the filename plus the 3 preceding directory names, with a sanity check to ensure the result has the correct length.

I'm closing this but keeping a note to mention it in review.
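As a rough illustration of that approach (not the actual implementation, which is not shown in this thread): take the last four path components and check a length. Exactly what the real sanity check measures is not spelled out here. The sketch assumes it is the combined length of the three directory names, compared against a 64-character hex digest, and all names are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: the three directory names together spell a 64-char hex digest.
EXPECTED_HASH_HEX_LENGTH = 64


def target_name_from_three_dirs(url: str) -> str:
    """Filename plus the 3 preceding directory names, with a length check."""
    parts = urlsplit(url).path.split("/")  # .path excludes the #fragment
    if len(parts) < 4:
        raise ValueError(f"path too short: {url!r}")
    *dirs, filename = parts[-4:]
    # Sanity check (assumed form): directory names must add up to the hash length.
    if not filename or sum(len(d) for d in dirs) != EXPECTED_HASH_HEX_LENGTH:
        raise ValueError(f"unexpected path layout: {url!r}")
    return "/".join(parts[-4:])
```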

@jku jku closed this as completed Oct 2, 2020