URL matching #1098

Merged
merged 6 commits into from
Jul 31, 2020
tomchristie (Member) commented Jul 29, 2020

This pull request refactors our URL matching for proxy lookups.

It adds a utility class URLMatcher that handles matching against the proxy keys that we currently support.

It's best explained through some examples...

>>> from httpx._utils import URLMatcher  # Private API, intended for internal usage only.
>>> import httpx

>>> pattern = URLMatcher("all")
>>> pattern.matches(httpx.URL("http://example.com"))
True

# With scheme matching...
>>> pattern = URLMatcher("https")
>>> pattern.matches(httpx.URL("https://example.com"))
True
>>> pattern.matches(httpx.URL("http://example.com"))
False

# With domain matching...
>>> pattern = URLMatcher("https://example.com")
>>> pattern.matches(httpx.URL("https://example.com"))
True
>>> pattern.matches(httpx.URL("http://example.com"))
False
>>> pattern.matches(httpx.URL("https://other.com"))
False

# Wildcard scheme, with domain matching...
>>> pattern = URLMatcher("all://example.com")
>>> pattern.matches(httpx.URL("https://example.com"))
True
>>> pattern.matches(httpx.URL("http://example.com"))
True
>>> pattern.matches(httpx.URL("https://other.com"))
False

# With port matching...
>>> pattern = URLMatcher("https://example.com:1234")
>>> pattern.matches(httpx.URL("https://example.com:1234"))
True
>>> pattern.matches(httpx.URL("https://example.com"))
False

Here's why we care about it...

  • The implementation in transport_for_url is much clearer now.
  • This initial pass doesn't hook in the no_proxies support yet, but we'll be able to extend it slightly to match against NO_PROXY, which lets us neatly resolve "Don't call should_not_be_proxied on each request" (#1062).
  • We'll be able to use this as the basis for our Mount API (#977).

Note that some of the cases in test_proxies_parameter have changed, but they're all odd cases where the desired behaviour is ambiguous, and I don't think we're actually regressing in any of them.

  • If a user explicitly uses proxies={"http://example.com:80": httpx.Proxy(...)}, then we now only match if the requested URL includes an explicit port 80. That seems pretty reasonable. It's an odd key to use in any case.
  • If a user uses proxies={"http://example.com": httpx.Proxy(...)}, then we now match both example.com and example.com:<some port> to the proxy. That seems like an improvement in expected behaviour.
  • A malformed request URL of "http://example.com:443" will now match against "http". It's pretty ambiguous what you'd expect there, but if anything it seems reasonable. Likely you'll end up with a transport error further down the line, but that's to be expected. (Or not, depending on what the heck you're actually doing with the kludgy scheme/port mismatch.)
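
The three cases above can be reproduced with a small lookup sketch. This is an illustration under stated assumptions, not the actual transport_for_url code: `key_matches` and `lookup_proxy` are hypothetical helpers, the first matching key wins (the real ordering may differ), and plain strings stand in for `httpx.Proxy(...)` instances.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def key_matches(key: str, url: str) -> bool:
    """Hypothetical helper: does a proxy key such as "http" apply to url?"""
    pattern = urlsplit(key if "://" in key else key + "://")
    target = urlsplit(url)
    # Each part of the key is only enforced when the key specifies it.
    scheme_ok = pattern.scheme in ("all", target.scheme)
    host_ok = pattern.hostname in (None, target.hostname)
    port_ok = pattern.port in (None, target.port)
    return scheme_ok and host_ok and port_ok


def lookup_proxy(proxies: Dict[str, str], url: str) -> Optional[str]:
    """Return the value of the first key that matches url, or None."""
    for key, proxy in proxies.items():
        if key_matches(key, url):
            return proxy
    return None


# Placeholder string standing in for an httpx.Proxy(...) instance.
PROXY = "proxy-placeholder"

# An explicit ":80" in the key requires an explicit ":80" in the request URL.
explicit_port_only = lookup_proxy({"http://example.com:80": PROXY}, "http://example.com")
explicit_port_hit = lookup_proxy({"http://example.com:80": PROXY}, "http://example.com:80")

# A key without a port matches the host on any port.
any_port_hit = lookup_proxy({"http://example.com": PROXY}, "http://example.com:8080")

# A scheme-only key ignores the scheme/port mismatch in the request URL.
mismatch_hit = lookup_proxy({"http": PROXY}, "http://example.com:443")
```

Here `explicit_port_only` is None, while the other three lookups return the placeholder, mirroring the behaviour described in the list.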

tomchristie added this to the v0.14 milestone Jul 29, 2020
tomchristie added the refactor (Issues and PRs related to code refactoring) label Jul 29, 2020
tomchristie requested a review from a team July 30, 2020 12:29

florimondmanca (Member) left a comment

Nice 👍 (Small typo nit if you're up for it.)

Co-authored-by: Florimond Manca <[email protected]>
tomchristie merged commit df54890 into master Jul 31, 2020
tomchristie deleted the url-matching branch July 31, 2020 09:11
Development

Successfully merging this pull request may close these issues.

Don't call should_not_be_proxied on each request
2 participants