Handling of legacy-incompatible paths by PathBuf::push on Windows
#135443
Labels
A-io
Area: `std::io`, `std::fs`, `std::net` and `std::path`
C-discussion
Category: Discussion or questions that doesn't represent real issues.
O-windows
Operating system: Windows
T-libs-api
Relevant to the library API team, which will review and decide on the PR/issue.
Should `pathbuf.push()` on Windows convert legacy paths to UNC syntax when they exceed the `MAX_PATH` length, or other limits?

Legacy MS-DOS (non-UNC) paths should not exceed `MAX_PATH` length (~260 chars) and have other syntactic limitations. Windows has partial, opt-in support for longer legacy paths, but not all APIs and not all applications support long legacy paths.

Windows has extended-length paths that start with a `\\?\` (UNC) prefix. They are Microsoft's preferred way of specifying long paths. They can be roughly 32,767 characters long, and don't have path handling quirks inherited from MS-DOS. However, the UNC paths are not properly supported by many Windows applications (#42869).

This question is related to `fs::canonicalize()`, which currently returns UNC paths even when not necessary. `fs::canonicalize()` would be more compatible with Windows apps if it returned legacy paths whenever possible (when they fit under `MAX_PATH` and meet other restrictions like reserved names). However, legacy paths have the length limit, so whether `fs::canonicalize(short_path).push(very_long_path)` works as expected depends on which syntax `fs::canonicalize` uses, or whether `push()` will correct the syntax if necessary.

Currently `push()` does not convert between legacy and UNC paths. If a `push()` causes a legacy path to exceed the limit, the path will keep using the legacy syntax, and technically won't be valid in APIs/apps that have the `MAX_PATH` limit. This is a simple, predictable implementation, but arguably makes it possible for `push()` to create an invalid path, a syntax error.

Besides the length limit, there are issues with reserved file names and trailing whitespace. `legacy_path.push("con.txt")` makes the whole path parse as a device name only, but `unc_path.push("con.txt")` simply appends the `con.txt` file name to the path as expected. Is this a bug in `push()`? Should `push` be a leaky abstraction that exposes quirks of how Windows parses legacy paths, or should `push()` switch to the less common UNC path syntax when it's necessary to precisely and losslessly append the given path components?

If `push()` converted paths to UNC whenever they exceed limits of legacy paths, then `push()` would be more robust, and semantically closer to `push()` on Unix that always appends components, rather than pushing characters that may cause the whole path to parse as something else, or get rejected entirely for not using a syntax appropriate for its length.