Skip to content

path/filepath: Walk recurses on directory symlinks on Windows #17540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jim-minter opened this issue Oct 21, 2016 · 17 comments
Closed

path/filepath: Walk recurses on directory symlinks on Windows #17540

jim-minter opened this issue Oct 21, 2016 · 17 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Milestone

Comments

@jim-minter
Copy link

In the documentation to filepath.Walk, it says "Walk does not follow symbolic links." This is not the case on Windows, and I think it's a bug.

What version of Go are you using (go version)?

$ go version
go version go1.7.1 windows/amd64

What operating system and processor architecture are you using (go env)?

$ go env
set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=x:\go
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1

What did you do?

Run the following program:

package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "path/filepath"
)

func main() {
    tempdir, err := ioutil.TempDir("", "")
    if err != nil {
        panic(err)
    }
    defer os.RemoveAll(tempdir)

    err = os.Symlink(`c:\`, filepath.Join(tempdir, "directorysymlink"))
    // err = os.Symlink("/", filepath.Join(tempdir, "directorysymlink"))
    if err != nil {
        panic(err)
    }

    err = filepath.Walk(tempdir, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }

        /*
        if info.Mode()&os.ModeSymlink != 0 && info.Mode()&os.ModeDir != 0 {
            return filepath.SkipDir
        }
        */

        fmt.Println(path)

        return nil
    })

    if err != nil {
        panic(err)
    }
}

What did you expect to see?

I expect filepath.Walk not to recurse into tmpdir/directorysymlink. So I expect to see:

C:\cygwin\tmp\337883043
C:\cygwin\tmp\337883043\directorysymlink

When run on a non-Windows system, this is the behaviour seen.

What did you see instead?

I see two problems.

  1. filepath.Walk recurses into tmpdir/directorysymlink (this issue).
  2. filepath.Walk hits an endless recursive loop due to path/filepath: Walk walks recursive NTFS junction points #10424. NB: This issue refers to NTFS junction points, whereas I am referring to directory symlinks.
C:\cygwin\tmp\902833999
C:\cygwin\tmp\902833999\directorysymlink
...
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data
C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data
panic: GetFileAttributesEx C:\cygwin\tmp\902833999\directorysymlink\Documents and Settings\All Users\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data: The system cannot find the path specified.

Final notes:

  1. a workaround for getting better behaviour is presented in the code above commented with /.../, but I don't think the library caller should be required to do this.
  2. I think Go should respect the underlying platform's (right or wrong!) decision to allow filesystem objects which report as both directory and symlink, and handle this appropriately in filepath.Walk by casting out directories which are also symlinks.
  3. FWIW, I think path/filepath: Walk walks recursive NTFS junction points #10424 is legitimate and filepath.Walk should also not follow directory junctions (hardlinks), so as to avoid recursive loops.
@hirochachacha
Copy link
Contributor

You've already know the problem and the fix looks very easy.
Do you have a plan to send a CL?

@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 21, 2016
@quentinmit quentinmit added this to the Go1.8Maybe milestone Oct 21, 2016
@quentinmit
Copy link
Contributor

I disagree about junctions; those seem like Unix mountpoints, which definitely should be descended.

@quentinmit quentinmit changed the title On Windows filepath.Walk recurses on directory symlinks path/filepath: Walk recurses on directory symlinks on Windows Oct 21, 2016
@hirochachacha
Copy link
Contributor

os.Readlink and os.ModeSymlink handle junctions as symlinks.
So I think the same decision here is nothing bad.

@alexbrainman
Copy link
Member

I agree, I think we should not walk both "directory junctions" (issue #10424) and "symlinks to directories" (this issue). Otherwise how do we avoid loops?

Alex

@alexbrainman
Copy link
Member

Something like this:

diff --git a/src/os/types_windows.go b/src/os/types_windows.go
index ad4e863..0d3b158 100644
--- a/src/os/types_windows.go
+++ b/src/os/types_windows.go
@@ -32,16 +32,16 @@ func (fs *fileStat) Mode() (m FileMode) {
    if fs == &devNullStat {
        return ModeDevice | ModeCharDevice | 0666
    }
-   if fs.sys.FileAttributes&syscall.FILE_ATTRIBUTE_DIRECTORY != 0 {
-       m |= ModeDir | 0111
-   }
    if fs.sys.FileAttributes&syscall.FILE_ATTRIBUTE_READONLY != 0 {
        m |= 0444
    } else {
        m |= 0666
    }
    if fs.sys.FileAttributes&syscall.FILE_ATTRIBUTE_REPARSE_POINT != 0 {
-       m |= ModeSymlink
+       return m | ModeSymlink
+   }
+   if fs.sys.FileAttributes&syscall.FILE_ATTRIBUTE_DIRECTORY != 0 {
+       m |= ModeDir | 0111
    }
    if fs.pipe {
        m |= ModeNamedPipe

should fix this issue.

Alex

@jim-minter
Copy link
Author

@quentinmit one of the problems with junction points is if you follow them, modern Windows boxes have a directory loop out of the box if you do a filepath.Walk on C:.

From an end user perspective I guess the alternatives are:

  • no change, in which case any developer on Windows probably needs to detect these specially and return SkipDir. Ported applications developed on other OSes risk falling into a directory loop bug.
  • change the behaviour, in which case a particularly interested developer on Windows can detect junction points, decide about loop detection and issue another filepath.Walk if appropriate (just as a developer would on Linux with symlinks).

FWIW I think the latter is better library default behaviour.

@jim-minter
Copy link
Author

@alexbrainman I think that on the one hand your suggested patch would probably have the advantage of solving #17541 as well. On the other hand it somewhat masks the fact that according to Windows the underlying object seems to be considered both a directory and a reparse point. I wonder whether a higher level adjustment in filepath.Walk itself (and in #17541, a higher level adjustment in tar.FileInfoHeader) is better. I don't know - I think there's an argument either way.

@hirochachacha
Copy link
Contributor

I imagined like this

diff --git a/src/path/filepath/path.go b/src/path/filepath/path.go
index 1d8e35c..7b4548c 100644
--- a/src/path/filepath/path.go
+++ b/src/path/filepath/path.go
@@ -346,17 +346,22 @@ type WalkFunc func(path string, info os.FileInfo, err error) error

 var lstat = os.Lstat // for testing

+func canWalk(info os.FileInfo) bool {
+   // windows's symlink could return info.IsDir() == true
+   return info.IsDir() && info.Mode()&os.ModeSymlink == 0
+}
+
 // walk recursively descends path, calling w.
 func walk(path string, info os.FileInfo, walkFn WalkFunc) error {
    err := walkFn(path, info, nil)
    if err != nil {
-       if info.IsDir() && err == SkipDir {
+       if canWalk(info) && err == SkipDir {
            return nil
        }
        return err
    }

-   if !info.IsDir() {
+   if !canWalk(info) {
        return nil
    }

@@ -375,7 +380,7 @@ func walk(path string, info os.FileInfo, walkFn WalkFunc) error {
        } else {
            err = walk(filename, fileInfo, walkFn)
            if err != nil {
-               if !fileInfo.IsDir() || err != SkipDir {
+               if !canWalk(fileInfo) || err != SkipDir {
                    return err
                }
            }

@alexbrainman
Copy link
Member

I wonder whether a higher level adjustment in filepath.Walk itself ... is better

I considered that too. And I am not sure which approach is better. Like you said, unlike Unix, Windows returns both "is directory" and "is symlink" for some directory elements. Perhaps that confused authors of path/filepath.Walk - I suspect they did not consider that possibility.

So the question is, do we:

  1. want "os" package on Windows imitate what Unix does (os.FileMode to have either ModeDir or ModeSymlink, but not both)

or

  1. accept what Windows returns for os.FileMode, and adjust all code outside of "os" package

?

I will let others decide here.

Alex

@rsc
Copy link
Contributor

rsc commented Oct 24, 2016

It is definitely unexpected that the mode would be both a directory and a symlink. It should be one or the other.

That said, it's important that every directory is reachable using a walk that stops at symlinks. I don't understand the specific details here, but, for example, if the root of a mounted file system were reported as a symlink with the effect that you can't ever walk into the file system at all, that would be a problem.

If we know there is a perfectly good, other, canonical name, then yes, the noncanonical one should be a symlink only, not a symlink and a directory.

It also matters here whether Lstat or Stat is being called. Lstat is supposed to report symlinks. Stat is supposed to not report them, instead following down to the thing being pointed at. Walk should be able to use Lstat to avoid infinite recursion.

@alexbrainman, given this, can you say a bit more about what you think should happen?

Thanks.

@alexbrainman
Copy link
Member

I don't understand the specific details here ...

I do not know much about symlinks myself. Symlinks appeared in Windows Vista. Windows XP did not have symlinks, but it had "Directory Junction" - these are similar to symlinks to directories - as far as I understand they have very similar functionality, but different implementation. Symlinks and "directory junctions" can be created with mklink program that comes with Windows.

C:\>mklink
Creates a symbolic link.

MKLINK [[/D] | [/H] | [/J]] Link Target

        /D      Creates a directory symbolic link.  Default is a file
                symbolic link.
        /H      Creates a hard link instead of a symbolic link.
        /J      Creates a Directory Junction.
        Link    specifies the new symbolic link name.
        Target  specifies the path (relative or absolute) that the new link
                refers to.

C:\>mkdir a

C:\>cd a

C:\a>mkdir dir

C:\a>echo aaa > dir\file1

C:\a>echo bbb > file2

C:\a>mklink s1 dir\file1
symbolic link created for s1 <<===>> dir\file1

C:\a>mklink s2 file2
symbolic link created for s2 <<===>> file2

C:\a>mklink /d s3 dir
symbolic link created for s3 <<===>> dir

C:\a>mklink /j s4 dir
Junction created for s4 <<===>> dir

C:\a>dir
 Volume in drive C has no label.
 Volume Serial Number is 1234-ABCD

 Directory of C:\a

28/10/2016  12:57 PM    <DIR>          .
28/10/2016  12:57 PM    <DIR>          ..
28/10/2016  12:55 PM    <DIR>          dir
28/10/2016  12:55 PM                 6 file2
28/10/2016  12:56 PM    <SYMLINK>      s1 [dir\file1]
28/10/2016  12:56 PM    <SYMLINK>      s2 [file2]
28/10/2016  12:56 PM    <SYMLINKD>     s3 [dir]
28/10/2016  12:57 PM    <JUNCTION>     s4 [C:\a\dir]
               3 File(s)              6 bytes
               5 Dir(s)   1,190,907,904 bytes free

C:\a>type s1
aaa

C:\a>type s2
bbb

C:\a>dir s3
 Volume in drive C has no label.
 Volume Serial Number is 1234-ABCD

 Directory of C:\a\s3

28/10/2016  12:55 PM    <DIR>          .
28/10/2016  12:55 PM    <DIR>          ..
28/10/2016  12:55 PM                 6 file1
               1 File(s)              6 bytes
               2 Dir(s)   1,190,907,904 bytes free

C:\a>dir s4
 Volume in drive C has no label.
 Volume Serial Number is 1234-ABCD

 Directory of C:\a\s4

28/10/2016  12:55 PM    <DIR>          .
28/10/2016  12:55 PM    <DIR>          ..
28/10/2016  12:55 PM                 6 file1
               1 File(s)              6 bytes
               2 Dir(s)   1,190,907,904 bytes free

C:\a>mkdir dir2

C:\a>mklink /d dir2 dir
Cannot create a file when that file already exists.

C:\a>mklink /j dir2 dir
Cannot create a file when that file already exists.

C:\a>

The API we use to create symlinks is CreateSymbolicLink.

... , but, for example, if the root of a mounted file system were reported as a symlink with the effect that you can't ever walk into the file system at all, that would be a problem.

I do not see how you can replace root directory with "symlink directory" or "directory junctions".

C:\a>mklink /j c:\ s5
Access is denied.

C:\a>mklink /d c:\ s6
Access is denied.

C:\a>

If we know there is a perfectly good, other, canonical name, then yes, the noncanonical one should be a symlink only, not a symlink and a directory.

I do not understand your suggestion.

It also matters here whether Lstat or Stat is being called. Lstat is supposed to report symlinks. Stat is supposed to not report them, instead following down to the thing being pointed at. Walk should be able to use Lstat to avoid infinite recursion.

I think Stat and Lstat work as expected:

C:\a>type c:\dev\src\t\main.go
package main

import (
        "fmt"
        "log"
        "os"
)

func readDir(dirname string) error {
        dir, err := os.Open(dirname)
        if err != nil {
                return err
        }
        names, err := dir.Readdirnames(-1)
        if err != nil {
                return err
        }
        for _, name := range names {
                fmt.Printf("\n%s:\n", name)
                fs, err := os.Stat(name)
                if err != nil {
                        return err
                }
                fmt.Printf("%v %v (stat)\n", fs.Mode(), fs.Name())
                fs, err = os.Lstat(name)
                if err != nil {
                        return err
                }
                fmt.Printf("%v %v (lstat)\n", fs.Mode(), fs.Name())
        }
        return nil
}

func main() {
        err := readDir(".")
        if err != nil {
                log.Fatal(err)
        }
}

C:\a>go run c:\dev\src\t\main.go

dir:
drwxrwxrwx dir (stat)
drwxrwxrwx dir (lstat)

dir2:
drwxrwxrwx dir2 (stat)
drwxrwxrwx dir2 (lstat)

file2:
-rw-rw-rw- file2 (stat)
-rw-rw-rw- file2 (lstat)

s1:
-rw-rw-rw- file1 (stat)
Lrw-rw-rw- s1 (lstat)

s2:
-rw-rw-rw- file2 (stat)
Lrw-rw-rw- s2 (lstat)

s3:
drwxrwxrwx dir (stat)
dLrwxrwxrwx s3 (lstat)

s4:
drwxrwxrwx dir (stat)
dLrwxrwxrwx s4 (lstat)

C:\a>

I also looked at actual Stat and Lstat implementation.

So what do you think we should do here?

Alex

@hirochachacha
Copy link
Contributor

By the way, If I mount a NTFS drive on linux, can I see the same issue? or its driver cooks for me?
I think this is not an OS specific issue, but a file system specific one.
In general, we cannot say symlink and directory is exclusive,
it depends on the file system and its driver.
If we choose changing os package, we need to consider other OSes also.
I don't have a strong opinion though.

@alexbrainman
Copy link
Member

If I mount a NTFS drive on linux, can I see the same issue?

I do not see how that is possible. See https://github.com/golang/go/blob/master/src/os/stat_linux.go#L17 for details. Each file is either symlink or directory, cannot be both.

Alex

@hirochachacha
Copy link
Contributor

I do not see how that is possible. See https://github.com/golang/go/blob/master/src/os/stat_linux.go#L17 for details. Each file is either symlink or directory, cannot be both.

Thank you for pointing out. So I understand that current behavior doesn't follow POSIX.

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html

@rsc
Copy link
Contributor

rsc commented Nov 11, 2016

This is a bit involved for the freeze. Leaving for Go 1.9.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36854 mentions this issue.

spenczar added a commit to spenczar/gps that referenced this issue Mar 21, 2017
Relative symlinks on windows confuse filepath.Walk. This is Go issue 17540,
golang/go#17540. While waiting for that to get fixed,
just use absolute symlinks in windows tests.
gopherbot pushed a commit that referenced this issue Apr 5, 2017
For #17540.

Change-Id: Ie01f39797526934fa553f4279cbde6c7cbf14154
Reviewed-on: https://go-review.googlesource.com/36854
Reviewed-by: Ian Lance Taylor <[email protected]>
Run-TryBot: Ian Lance Taylor <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
krisnova pushed a commit to krisnova/dep that referenced this issue Apr 21, 2017
Relative symlinks on windows confuse filepath.Walk. This is Go issue 17540,
golang/go#17540. While waiting for that to get fixed,
just use absolute symlinks in windows tests.
@gopherbot
Copy link
Contributor

CL https://golang.org/cl/41830 mentions this issue.

kivisade added a commit to kivisade/fsized that referenced this issue Jul 2, 2017
…s on Windows" issue (see golang/go#17540)

[~] command output now shows how much actual overhead is better/worse than rough estimate
@golang golang locked and limited conversation to collaborators Apr 26, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Projects
None yet
Development

No branches or pull requests

6 participants