path/filepath: Need a way to canonicalize paths #17084

Closed
sschuberth opened this issue Sep 13, 2016 · 15 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

@sschuberth

What version of Go are you using (go version)?

go version go1.5.3 windows/amd64

What operating system and processor architecture are you using (go env)?

set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GO15VENDOREXPERIMENT=
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1

What did you do?

I was looking for a way to canonize a path as Abs "is not guaranteed to be unique". Calling Abs on Windows for example might return a path containing DOS-style 8.3 short names. On Linux, Abs might return a path with symlinks.

What did you expect to see?

I was expecting to find a function to canonize a path, so paths can be safely compared.

What did you see instead?

I did not find a way to canonize a path / to create a unique absolute path.

@minux
Member

minux commented Sep 13, 2016 via email

@sschuberth
Author

How about Abs(EvalSymlinks(path)) on Linux?

That looks like a viable implementation for Linux of a function to canonize paths.

Thanks for pointing out os.SameFile() which indeed works for my use-case. However, I still feel having a function to canonize paths could be useful. Feel free to close this if you disagree.

@alexbrainman
Member

I think EvalSymlinks does what you want on windows.

Alex

@sschuberth
Author

I've just confirmed that something like

filepath.EvalSymlinks("C:\\Users\\sebastian\\DOWNLO~1")

will indeed return

C:\Users\sebastian\Download

on Windows, which is what I wanted. So probably EvalSymlinks()'s documentation just needs some improvement in this regard.

Thanks!

@sschuberth
Author

I've created a small change to clarify the docs: https://go-review.googlesource.com/29055

@sschuberth
Author

Reopening because @alexbrainman stated that we do not want to document EvalSymlinks() to work that way. So we indeed would need a separate function to canonize paths which is documented to also convert Windows short filenames to long filenames.

@sschuberth sschuberth reopened this Sep 14, 2016
@bradfitz
Contributor

Something in golang.org/x/sys/windows perhaps?

@sschuberth
Author

I was more thinking about an OS-independent place. golang.org/x/sys/windows seems to be more about direct 1-to-1 mappings to the Windows API. Interestingly, the Windows API has a PathCanonicalize function, but that does not do the right thing.

Basically, I'm suggesting something like Java's getCanonicalPath, implemented for all OSes.

@robpike robpike changed the title path/filepath: Need a way to canonize paths path/filepath: Need a way to canonicalize paths Sep 15, 2016
@robpike
Contributor

robpike commented Sep 15, 2016

I'm not sure this is even possible in general, or at least I don't know how to define a portable definition of a canonical file name. Perhaps it's a Windows-specific issue.

@griesemer
Contributor

We do have path/filepath.ToSlash which is half of the story.

@minux
Member

minux commented Sep 15, 2016 via email

@alexbrainman
Member

So we indeed would need a separate function to canonize paths which is documented to also convert Windows short filenames to long filenames.

Why would you need such a function?

You can use os.SameFile to test if 2 files are the same. You can even use filepath.EvalSymlinks to convert real file paths in different formats to the same string. For example, this https://play.golang.org/p/1FnLrOOXoq outputs:

C:\Documents and Settings\brainman\.gitconfig
C:\Documents and Settings\brainman\.gitconfig
C:\Documents and Settings\brainman\.gitconfig
C:\Documents and Settings\brainman\.gitconfig
C:\Documents and Settings\brainman\.gitconfig
C:\Documents and Settings\brainman\.gitconfig

on my computer.

filepath.EvalSymlinks started as a "canonical" way to express a file path (https://codereview.appspot.com/5713043/#msg5), and we have tried to keep filepath.EvalSymlinks working this way. But I don't think there is any definition of "canonical" in the Windows API. There is no Windows API that would provide that functionality. What we do in filepath.EvalSymlinks is mish and mash.

Something in golang.org/x/sys/windows perhaps?

Let's see what exactly @sschuberth wants.

We do have path/filepath.ToSlash which is half of the story.

filepath.ToSlash just converts \ into / in a string. filepath.EvalSymlinks works on real files / directories and uses Windows' help to discover "real" file names. As you can see from https://play.golang.org/p/1FnLrOOXoq Windows accepts many different ways to name a file, and filepath.EvalSymlinks converts all the different variations into a single form.
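The difference between string-only functions and filepath.EvalSymlinks can be sketched as follows. The helper name lexical and the path "no-such-dir-for-demo" are made up for this illustration, and the sketch assumes no directory of that name exists in the working directory.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// lexical is a hypothetical helper: it normalizes a path with string
// operations only and never looks at the file system.
func lexical(p string) string {
	return filepath.ToSlash(filepath.Clean(p))
}

func main() {
	// Works even though none of these directories exist.
	fmt.Println(lexical("a/./b/../c"))

	// EvalSymlinks consults the file system, so a path that does not
	// exist (assumed here) yields an error instead of a result.
	_, err := filepath.EvalSymlinks(filepath.Join("no-such-dir-for-demo", "file"))
	fmt.Println(err != nil)
}
```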

... what's the definition for a canonical path? Every Clean'ed and
EvalSymlink'ed path is as canonical as any others.

I agree. Putting the word "canonical" in the documentation might just confuse things even more.

Maybe we should let Russ decide what to do here. He started it https://codereview.appspot.com/5713043/#msg5 :-)

Alex

@sschuberth
Author

[...] I don't know how to define a portable definition of a canonical file name.

Probably like Java does: "A canonical pathname is both absolute and unique. The precise definition of canonical form is system-dependent".

Perhaps it's a Windows-specific issue.

I don't think it is. Creating a canonical path involves steps that are applicable to e.g. Linux, too. Again quoting the Java docs: "This typically involves removing redundant names such as "." and ".." from the pathname, resolving symbolic links (on UNIX platforms), and converting drive letters to a standard case (on Microsoft Windows platforms)". Although not explicitly mentioned here, Java also converts short filenames in a path to long filenames on Windows as part of canonization.

We do have path/filepath.ToSlash which is half of the story.

I don't think ToSlash() does any good here. I'm not talking about canonizing paths across different OSes. I.e. I'm not looking for a way to compare Windows paths to Linux paths or so. That is, a canonical path on Windows can and in fact should contain backslashes instead of slashes, as that's the OS-native way of specifying paths.

I think the requirement itself is problematic. Trying to use a path to determine two files are the same portably is inherently impossible.

I'm not sure it's impossible. But I agree using stat() like SameFile() does is much safer and cleaner.

Maybe a better example is where you want to "clean" paths input by the user before showing them in some UI.

Given that, what's the definition for a canonical path? Every Clean'ed and EvalSymlink'ed path is as canonical as any others.

That happens to be the case with the current implementation of EvalSymlinks(), yes. But @alexbrainman did not want to document that fact in order to be free to change the implementation some day so that it still evaluates symbolic links, but no longer resolves short filenames to long filenames on Windows. And then the path returned by EvalSymlinks() would not be canonical anymore, and your statement would be wrong.

There is no Windows API that would provide that functionality. What we do in filepath.EvalSymlinks is mish and mash.

Which is just fine with me. From my point of view Golang could come up with its own definition of what a canonical path should look like for each OS, i.e. what transformations are involved, as long as it's a sane and consistent definition.

@quentinmit quentinmit added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 3, 2016
@quentinmit
Contributor

/cc @rsc Apparently you have opinions on EvalSymlinks's functionality. What, if anything, is there for us to do here?

@quentinmit quentinmit added this to the Go1.8Maybe milestone Oct 3, 2016
@rsc
Contributor

rsc commented Oct 6, 2016

I agree with the people who said this is too hard to define. Since os.SameFile solved the actual motivation here, let's wait until there's other unsolved motivation instead of doing something for hypothetical reasons.

I do think it's important that EvalSymlinks returns a consistent name for any particular directory, but I'm not particularly interested in documenting which one is returned. Probably we shouldn't.
