This repository was archived by the owner on Nov 20, 2018. It is now read-only.

Publish the spec behind the Microsoft Querystring Parsing and Serialization rules #67

Closed
lilith opened this issue May 30, 2014 · 5 comments


lilith commented May 30, 2014

Could we get some light shed on the whys/hows of querystring handling logic in vNext? Previous versions have been inconsistent at best, and this seems like a great chance to fix things.

An all-encompassing parsing strategy may be possible, but inadvisable, as the logic could not be clearly communicated to developers. I would suggest instead that parsing/serialization formats be named specifically for their appropriate use.

For example, requests with content-type application/x-www-form-urlencoded must follow the WHATWG parsing/serialization spec, which differs from PHP, Ruby, Node, Python, classic ASP, and each implementation in ASP.NET vNow. Most of these have differing rules on one or more points:

  1. Handling of duplicate keys. (error, replace, concatenate, or build list)?
  2. Hash serialization. key[a]=1&key[b]=2 produces key = {a=1, b=2} in some implementations
  3. Array serialization. a=1&a=2, a=1,2, a[]=1&a[]=2, a[1]=1, a[2]=2 are all valid ways to represent an array value on different platforms.
  4. Handling of invalid characters (error, or pretend they were URL encoded)?
  5. Encoding of reserved characters (Most frameworks fail to url encode reserved characters correctly)
  6. Order preservation - some developers (inadvisably) rely on the order of querystring pairs. A recent instance of this is in the latest version of Umbraco.
  7. Case preservation and case sensitivity.
  8. URL decoding pass count (AFAIK, only ASP.NET WebForms messes this up). Ideally, decoding both + and %20 to a space should be the only lossy operation.
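
To make a few of these points concrete, here is a minimal sketch using Python's standard urllib.parse module. It only shows how one platform behaves and is not a proposal for the vNext rules; the sample query strings are made up for illustration.

```python
from urllib.parse import parse_qs, parse_qsl, unquote

query = "a=1&b=x&a=2"

# Point 1: duplicate keys. The same input yields different results
# depending on the policy the parser (or the caller) picks.
as_lists = parse_qs(query)    # build list: every value is kept
ordered = parse_qsl(query)    # flat pairs: order and duplicates preserved (point 6)
last_wins = dict(ordered)     # replace: the later value overwrites the earlier one
concatenated = {k: ",".join(v) for k, v in as_lists.items()}  # concatenate

# Points 2 and 3: Python gives brackets no special meaning, so
# key[a]=1 is simply a pair whose key is the literal string "key[a]".
bracket_pairs = parse_qsl("key[a]=1&key[b]=2")

# Point 7: keys are case-sensitive and their case is preserved.
case_pairs = parse_qs("A=1&a=2")

# Point 8: '+' and '%20' both decode to a space, so the original
# spelling cannot be recovered after parsing.
space_pairs = parse_qsl("q=a+b&r=a%20b")

# Point 8: a second decoding pass changes the data. "%2520" should
# decode once to the literal text "%20", not to a space.
once = unquote("%2520")
twice = unquote(once)

print(as_lists, ordered, last_wins, concatenated)
print(bracket_pairs, case_pairs, space_pairs, repr(once), repr(twice))
```

A framework that expands bracket syntax into hashes or arrays would return different structures for the same inputs, which is why naming each format explicitly seems preferable to a single catch-all parser.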

Once we know the team's opinions on querystrings (and paths - and PathInfos), I think there are a lot of developers that would be happy to pitch in with unit tests and compatibility profiles. I imagine that JavaScript/Node and PHP compatibility would be the highest priority.

Of course, this is easiest if we can ensure that we do not lose any data between the network packet and the developer. I'm looking at you, IIS.

Author

lilith commented Jun 16, 2014

Also, a compatibility matrix between Uri.(Un)EscapeDataString and the HttpUtility methods would be useful if we're switching. Incorrect space handling is well-known, but Unicode encoding is a different story. HttpUtility guesses encoding (often wrongly) from the request, and the two appear to take very different approaches to actual byte mangling.
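
As a rough illustration of why the assumed encoding matters, the sketch below uses Python's urllib.parse rather than the .NET APIs named above, so it says nothing about what Uri or HttpUtility actually do. It only shows that the same percent-encoded bytes decode to different text depending on the charset the decoder assumes.

```python
from urllib.parse import quote, unquote

text = "é"

# The same character percent-encodes differently per charset.
utf8_form = quote(text)                       # two bytes in UTF-8
latin1_form = quote(text, encoding="latin-1")  # one byte in Latin-1

# Decoding with the matching charset round-trips cleanly.
utf8_roundtrip = unquote(utf8_form)
latin1_roundtrip = unquote(latin1_form, encoding="latin-1")

# Decoding Latin-1 bytes as UTF-8 (the default here) loses the data:
# the invalid byte becomes U+FFFD, the replacement character.
wrong_guess = unquote(latin1_form)

print(utf8_form, latin1_form, utf8_roundtrip, latin1_roundtrip, repr(wrong_guess))
```

A compatibility matrix could record exactly this kind of input/output pair for each method and each assumed encoding.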

Member

glennc commented Jun 23, 2014

@GrabYourPitchforks Can you talk about this here for now?

@GrabYourPitchforks
Contributor

Not quite sure what you mean about IIS and HttpUtility.UrlEncode/UrlDecode munging the encoding. IIS doesn't touch the query string at all (it just gets forwarded through as raw bytes), and UrlEncode/UrlDecode always assumes UTF-8 unless the developer has specified something else in Web.config.

The biggest difference between UnescapeDataString and HttpUtility.UrlDecode is two-fold: (a) UrlDecode can only be used for the query string, not for path segments, due to the way it handles the U+0020 code point; and (b) UrlDecode understands the non-standard %uXXXX format.
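
The two differences can be sketched with an analogy in Python's urllib.parse. This is not the .NET implementation: the helper names decode_path_segment and decode_query_component are hypothetical, and Python's behavior stands in only to show the shape of the distinction.

```python
from urllib.parse import unquote, unquote_plus


def decode_path_segment(segment: str) -> str:
    """Hypothetical helper: percent-decode only, '+' stays a literal plus."""
    return unquote(segment)


def decode_query_component(component: str) -> str:
    """Hypothetical helper: form-style decoding, '+' becomes a space."""
    return unquote_plus(component)


raw = "a+b%20c"

# (a) Space handling: form-style decoding turns '+' into U+0020,
# which corrupts a path segment that contains a literal plus sign.
as_path = decode_path_segment(raw)
as_query = decode_query_component(raw)

# (b) The non-standard %uXXXX form: Python's decoder does not
# understand it and passes the text through unchanged.
percent_u = unquote("%u00e9")

print(repr(as_path), repr(as_query), repr(percent_u))
```

Keeping the path and query decoders as separate, clearly named functions makes it harder to apply form-style space handling to a path by accident.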

We've been working on new encoding / decoding routines which should provide a more uniform and easy-to-understand behavior. These routines will also support previous customer suggestions like leaving IRI-safe characters unescaped. I don't have an ETA for when these will be available.

@davidfowl
Member

@GrabYourPitchforks These are available now, right?

@aspnet-hello

This issue was moved to dotnet/aspnetcore#2734

@aspnet aspnet locked and limited conversation to collaborators Jan 2, 2018
@aspnet-hello aspnet-hello removed this from the Backlog milestone Jan 2, 2018
6 participants