Skip to content

url: re-investigate the usefulness of url.originFor() #10800

@TimothyGu

Description

@TimothyGu

Some backgrounds. Doubts on the usefulness of URL.originFor() static method were first cast in #10374, which is now marked as closed as the function was moved to require('url'). However, its usefulness remains questionable.

@domenic first inquired in #10374 (comment), without an apparent reply:

Hmm, what is the benefit of that over new URL(...).origin?

This issue is echoed in my #10620 (comment), and @targos' #10620 (comment), in which it is agreed to move any conversation on the utility of originFor elsewhere. This issue is dedicated to this exact discussion.

What is this function in the first place?

The originally proposed documentation says the following about url.originFor() (retrievable through jasnell/node@6b6c374; since removed from #10620):

Returns an object representing the origin of the given URL. The origin object is considered to be opaque. That is, while there are properties and methods exported by the object, they are not considered to be part of the "public" API of the object.

Basically, it is saying that this function returns an object that should not be consumed in anyway by the caller. This alone undermines its utility for the generic user of Node.js.

From the standpoint of Web standards implemented, the function performs the operation outlined in URL Standard § Origin, which only specifies the format of a URL origin in spec-level (rather than IDL-level or ES-level), and thus does not have any analogs in the browser.

For a list of whitelisted protocols, the function returns a TupleOrigin object which is like a lite-URL, representing the origin of the main URL. For non-whitelisted protocols, it returns an OpaqueOrigin object, which only exposes a toString method that returns 'null' and an effectiveDomain which has a getter that returns the OpaqueOrigin itself.

Should we make the origin object not opaque?

We can, of course, but the usefulness of this is contested, for two reasons. First, the entire concept of origin is tightly connected to Web security in a way that is arguably not applicable to server-side usage (whitelisting of select protocols). Second, almost all of its features can be achieved through the URL class:

const url = require('url');
const { URL } = url;

let str = 'http://...';

url.originFor(str).toString()      === new URL(str).origin

// the following assumes `str` contains a valid URL of common
// protocols like HTTP(S), FTP, WS(S), Gopher, etc. (i.e. the origin
// is a tuple origin, instead of opaque origin)
url.originFor(str).scheme          === new URL(new URL(str).origin).protocol
url.originFor(str).host            === new URL(new URL(str).origin).host
url.originFor(str).port            === new URL(new URL(str).origin).port
url.originFor(str).domain          === new URL(new URL(str).origin).domain

The only property that cannot be translated directly is TupleOrigin's effectiveDomain, the concept behind which is only used once in the entire WHATWG HTML Standard (to help specify document.domain) and unused in the URL Standard at all.

What can we do at this point?

I don't think there will be any ill effect if we simply remove url.originFor() outright. After all, it is never documented, and even in the pending documentation (which lists the entire WHATWG URL module as Experimental) this function is not included.

Or, if we decide to keep this function, we should first add comprehensive unit tests for this function, and then document and expose the returned origin object fully.

No matter which way we take, this current state of being in limbo will only confuse potential users.

Metadata

Metadata

Assignees

No one assigned

    Labels

    whatwg-urlIssues and PRs related to the WHATWG URL implementation.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions