Skip to content

FileSystemGlobbing.Matcher is not matching '**/*' #44767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
JoeRobich opened this issue Nov 16, 2020 · 16 comments
Closed

FileSystemGlobbing.Matcher is not matching '**/*' #44767

JoeRobich opened this issue Nov 16, 2020 · 16 comments
Assignees
Milestone

Comments

@JoeRobich
Copy link
Member

Description

In dotnet/format we use the FileSystemGlobbing.Matcher to allow users the ability to filter the files that are being formatted. We started getting reports that dotnet/format was not formatting the expected files. While troubleshooting it became apparent that the Matcher was no longer matching paths as expected.

Configuration

This simple console app tries to match different path strings using the "Any files in any subdirectory" pattern.

TestConsole.csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.FileSystemGlobbing" Version="5.0.0" />
  </ItemGroup>
</Project>

Program.cs:

using System;
using Microsoft.Extensions.FileSystemGlobbing;

class Program
{
    static void Main()
    {
        var fileMatcher = new Matcher();
        fileMatcher.AddInclude("**/*");

        var fakeWindowsPath = "C:\\This\\is\\a\\nested\\windows-like\\path\\somefile.cs";
        Console.WriteLine($"**/* matches {fakeWindowsPath}: {fileMatcher.Match(fakeWindowsPath).HasMatches}");

        var fakeUnixPath = "/This/is/a/nested/unix-like/path/somefile.cs";
        Console.WriteLine($"**/* matches {fakeUnixPath}: {fileMatcher.Match(fakeUnixPath).HasMatches}");
    }
}

Output running on MacOS:

**/* matches C:\This\is\a\nested\windows-like\path\somefile.cs: True
**/* matches /This/is/a/nested/unix-like/path/somefile.cs: False

Output running on Windows:

**/* matches C:\This\is\a\nested\windows-like\path\somefile.cs: False
**/* matches /This/is/a/nested/unix-like/path/somefile.cs: False

Regression?

This had worked at least as late as .NET 5 Preview 7.

@ghost
Copy link

ghost commented Nov 17, 2020

Tagging subscribers to this area: @maryamariyan
See info in area-owners.md if you want to be subscribed.

Issue Details
Description:

Description

In dotnet/format we use the FileSystemGlobbing.Matcher to allow users the ability to filter the files that are being formatted. We started getting reports that dotnet/format was not formatting the expected files. While troubleshooting it became apparent that the Matcher was no longer matching paths as expected.

Configuration

This simple console app tries to match different path strings using the "Any files in any subdirectory" pattern.

TestConsole.csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.FileSystemGlobbing" Version="5.0.0" />
  </ItemGroup>
</Project>

Program.cs:

using System;
using Microsoft.Extensions.FileSystemGlobbing;

class Program
{
    static void Main()
    {
        var fileMatcher = new Matcher();
        fileMatcher.AddInclude("**/*");

        var fakeWindowsPath = "C:\\This\\is\\a\\nested\\windows-like\\path\\somefile.cs";
        Console.WriteLine($"**/* matches {fakeWindowsPath}: {fileMatcher.Match(fakeWindowsPath).HasMatches}");

        var fakeUnixPath = "/This/is/a/nested/unix-like/path/somefile.cs";
        Console.WriteLine($"**/* matches {fakeUnixPath}: {fileMatcher.Match(fakeUnixPath).HasMatches}");
    }
}

Output running on MacOS:

**/* matches C:\This\is\a\nested\windows-like\path\somefile.cs: True
**/* matches /This/is/a/nested/unix-like/path/somefile.cs: False

Output running on Windows:

**/* matches C:\This\is\a\nested\windows-like\path\somefile.cs: False
**/* matches /This/is/a/nested/unix-like/path/somefile.cs: False

Regression?

This had worked at least as late as .NET 5 Preview 7.

Author: JoeRobich
Assignees: -
Labels:

area-Extensions-FileSystem, regression-from-last-release, untriaged

Milestone: -

@maryamariyan maryamariyan removed the untriaged New issue has not been triaged by the area owner label Nov 18, 2020
@maryamariyan maryamariyan added this to the 6.0.0 milestone Nov 18, 2020
@maryamariyan
Copy link
Member

I doubt the regression is from a change within FileSystemGlobbing itself specifically, as I see no recent changes there from .NET 5 Preview 7. Will dig into its references too to see what changed.

@maryamariyan
Copy link
Member

cc: @carlossanlop @jozkee

@maryamariyan
Copy link
Member

maryamariyan commented Nov 18, 2020

@JoeRobich To confirm regression, I also ran the same sample code on netcoreapp3.0 using 3.0 version of M.E.FileSyetemGlobbing on both mac and windows, and see the same result (False/False on windows and True/False on mac).

Perhaps not a regression? Am I missing something?
What is the expected result here?

@JoeRobich
Copy link
Member Author

JoeRobich commented Nov 18, 2020

@maryamariyan Sorry, after looking at the implementation of InMemoryDirectoryInfo, I see that it only returns filepaths that begin with the current working directory for matching. The reason on unix-like environments that there is a match for windows-like paths is due to Path.GetFullPath prepending the current working directory during path normalization ("/Users/joeyrobichaud/Source/TestConsole/C:\This\is\a\nested\windows-like\path\somefile.cs").

This explains why the behavior in dotnet-format was broken, since it isn't unreasonable for users to run formatting on code outside of the current working directory. Do you have any suggestions for how we can use this library for matching file names to patterns without the requirement that all file paths be beneath the current working directory?

@carlossanlop
Copy link
Contributor

carlossanlop commented Nov 18, 2020

@JoeRobich your path's format should match the expected one for the current OS.

If you are using the following path, which you shared, in a Unix environment:

        var fakeWindowsPath = "C:\\This\\is\\a\\nested\\windows-like\\path\\somefile.cs";

Then you will not get the same result you would get in Windows, because:

  • In Unix, the backslash \ character is a valid path character, so it's not officially considered a separator:
carlos@carlospc:~/tmp$ mkdir "my\folder"
carlos@carlospc:~/tmp$ ls -l
total 4
drwxr-xr-x 2 carlos carlos 4096 Nov 18 13:11 'my\folder'
carlos@carlospc:~/tmp$ rmdir "my\folder"
  • The colon : character is a valid path character:
carlos@carlospc:~/tmp$ mkdir "my:folder"
carlos@carlospc:~/tmp$ ls -l
total 4
drwxr-xr-x 2 carlos carlos 4096 Nov 18 13:12 my:folder
  • Since your path does not begin with the Unix root character /, it is not considered rooted by the OS, so it gets normalized to ensure we pass the full path, as you described, which ends up prepending the current directory.

On the other hand, Windows is a bit more flexible when you pass it a Unix path.

For example, the following path, which you shared, if you use it in a Windows machine:

        var fakeUnixPath = "/This/is/a/nested/unix-like/path/somefile.cs";

Then Windows will:

  • Determine that forward slash / is a valid separator character.
  • Your path is rooted because it starts with /.
  • Your path does not specify a drive, so it will assume you are working on the current process' drive.
  • When the path gets normalized, it will prepend the drive letter and a colon at the beginning, and then switch all the forward slashes to backslashes.

So my suggestions are:

  • Make sure you are dealing with valid paths for the current OS.
  • Pass a full path to the globbing method, so that you don't get the current process' path unexpectedly prepended during normalization.

Hope that helps. Please let me know if you have any more questions.

@JoeRobich
Copy link
Member Author

JoeRobich commented Nov 18, 2020

@carlossanlop Sorry, I don't have a problem with the normalization of paths.

Our paths are full paths and correct for the OS. The issue is that if the path doesn't begin with the current working directory the Matcher will not produce matches (https://github.com/dotnet/runtime/blob/master/src/libraries/Microsoft.Extensions.FileSystemGlobbing/src/InMemoryDirectoryInfo.cs#L86). If you can give guidance on how to work around this current working directory requirement it would be appreciated.

@maryamariyan
Copy link
Member

Just to reconfirm @JoeRobich this is not a regression right?

@JoeRobich
Copy link
Member Author

@maryamariyan It appears to not be a regression.

@jozkee
Copy link
Member

jozkee commented Nov 23, 2020

InMemoryDirectoryInfo, I see that it only returns filepaths that begin with the current working directory for matching.

I think that's not true, absolute paths as the one you mention, C:\This\is\a\nested\windows-like\path\somefile.cs, can be matched correctly.

Do you have any suggestions for how we can use this library for matching file names to patterns without the requirement that all file paths be beneath the current working directory?

MatcherExtensions.Match method can receive a rootDir, by default uses the current working directory, that's why in your samples it only matches files in such directory.

@JoeRobich are you able to provide a root directory to the method to workaround this?
I would like to suggest to just provide the root to be able to match any path in any directory but as I said in my second comment, there is a bug with paths that end with a separator.

@jozkee
Copy link
Member

jozkee commented Nov 23, 2020

I also found that there is a bug if you use a root directory that end with a separator, such as C:\.

if (!filePath.StartsWith(rootDir, StringComparison.Ordinal) ||
filePath.IndexOf(Path.DirectorySeparatorChar, rootDir.Length) != rootDir.Length)

Files won't match rootDir in said case because of the second condition. That can be fixed by avoiding the filePath.IndexOf(Path.DirectorySeparatorChar, rootDir.Length) != rootDir.Length check when rootDir ends with separator.

@maryamariyan
Copy link
Member

@jozkee that bug is tracked with #36415

@JoeRobich
Copy link
Member Author

I think that's not true, absolute paths as the one you mention, C:\This\is\a\nested\windows-like\path\somefile.cs, can be matched correctly.

@jozkee the only reason the windows style path matched was because the path normalization on a unix like environment prepends the current directory. The StartsWith check on line 122 of the InMemoryDirectoryInfo blocks emitting paths that do not start with the rootDir.

Thanks for suggesting the overload. I agree that the bug you mentioned is blocking our ability to simply use Matcher.Match(rootDir: Path.GetPathRoot(somePath), file: somePath).HasMatches.

@jozkee
Copy link
Member

jozkee commented Mar 31, 2021

Fixed in #45139
@JoeRobich can you please try again using tomorrow's prerelease build of Microsoft.Extensions.FileSystemGlobbing?
You can find it in this NuGet package source https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet6/nuget/v3/index.json.

@JoeRobich
Copy link
Member Author

Thanks @jozkee! I will give this a try tomorrow.

@adamsitnik
Copy link
Member

Fixed by #45139, closing

@ghost ghost locked as resolved and limited conversation to collaborators Aug 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants