Skip to content

bpo-22276: Fix pathlib.Path.glob not to ignore trailing path separator #10349

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 28, 2022

Conversation

e-kwsm
Copy link
Contributor

@e-kwsm e-kwsm commented Nov 6, 2018

See also Issue 33392: pathlib .glob('*/') returns files as well as directories, closed as duplicate of 22276.

At present pathlib.Path.glob returns directories and FILES even if glob pattern ends with the path separator (/ in Linux), while glob.glob works as expected.
This PR fixes the behaviour.

$ uname -s
Linux
$ mkdir /tmp/test-glob
$ cd /tmp/test-glob
$ mkdir dir
$ touch file
$ echo *  # shell glob
dir file
$ echo */
dir/
>>> from glob import glob
>>> glob("*")
['file', 'dir']
>>> glob("*/")
['dir/']
>>> 
>>> from pathlib import Path
>>> list(Path(".").glob("*"))
[PosixPath('file'), PosixPath('dir')]
>>> list(Path(".").glob("*/"))
[PosixPath('file'), PosixPath('dir')]

https://bugs.python.org/issue22276

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed tests, the documentation update, a news and What's New entries.

And don't forget about an alternate separator on Windows.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@e-kwsm e-kwsm force-pushed the fix-pathlib.Path.glob branch 3 times, most recently from fdc8e41 to 187aaf0 Compare November 10, 2018 22:21
@e-kwsm
Copy link
Contributor Author

e-kwsm commented Nov 14, 2018

I didn't expect the Spanish Inquisition.

@bedevere-bot
Copy link

Nobody expects the Spanish Inquisition!

@serhiy-storchaka: please review the changes made to this pull request.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to use sep and altsep (and corresponding pathlib attributes) instead of hardcoding '/' in the general code.

@e-kwsm e-kwsm force-pushed the fix-pathlib.Path.glob branch from 187aaf0 to dcae663 Compare November 17, 2018 19:18
@e-kwsm
Copy link
Contributor Author

e-kwsm commented Nov 17, 2018

@serhiy-storchaka Thank you for your advice.

@e-kwsm
Copy link
Contributor Author

e-kwsm commented Nov 17, 2018

BTW, I address bpo-33392 (Path.glob and Path.rglob) rather than [bpo-22276](https://bugs.python.org/issue) (PurePath.match) in this PR; the former is closed as duplicate of the latter.

@csabella csabella requested a review from serhiy-storchaka May 29, 2019 13:26
@csabella csabella requested review from serhiy-storchaka and pitrou and removed request for serhiy-storchaka May 25, 2020 14:40
@e-kwsm
Copy link
Contributor Author

e-kwsm commented Jan 5, 2021

ping

@e-kwsm e-kwsm requested a review from brettcannon as a code owner April 2, 2022 13:51
@brettcannon
Copy link
Member

/cc @barneygale

@barneygale
Copy link
Contributor

Hey @e-kwsm, sorry for the long wait. Are you still interested in working on this PR? If so I can help review.

@e-kwsm
Copy link
Contributor Author

e-kwsm commented Apr 8, 2022

Are you still interested in working on this PR?

Yes.

@ghost
Copy link

ghost commented Apr 13, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

@arhadthedev
Copy link
Member

@e-kwsm The bot is broken and is being fixed so no worries.

@barneygale
Copy link
Contributor

I don't think the code should make additional is_dir() calls, as those cause a stat() under the hood. The "selector" classes already have this information, it's just a case of feeding it through to the _TerminatingSelector. What do you think @e-kwsm?

@brettcannon brettcannon reopened this Apr 25, 2022
@serhiy-storchaka
Copy link
Member

You can also press "Update branch" to trigger testing.

@serhiy-storchaka serhiy-storchaka self-requested a review April 26, 2022 06:08
@serhiy-storchaka
Copy link
Member

I propose simpler patch:

diff --git a/Lib/pathlib.py b/Lib/pathlib.py
index 4763ab54f6..a759db1684 100644
--- a/Lib/pathlib.py
+++ b/Lib/pathlib.py
@@ -281,7 +281,9 @@ def make_uri(self, path):
 def _make_selector(pattern_parts, flavour):
     pat = pattern_parts[0]
     child_parts = pattern_parts[1:]
-    if pat == '**':
+    if not pat:
+        return _TerminatingSelector()
+    elif pat == '**':
         cls = _RecursiveWildcardSelector
     elif '**' in pat:
         raise ValueError("Invalid pattern: '**' can only be an entire path component")
@@ -943,6 +945,8 @@ def glob(self, pattern):
         drv, root, pattern_parts = self._flavour.parse_parts((pattern,))
         if drv or root:
             raise NotImplementedError("Non-relative patterns are unsupported")
+        if pattern[-1] in (self._flavour.sep, self._flavour.altsep):
+            pattern_parts.append('')
         selector = _make_selector(tuple(pattern_parts), self._flavour)
         for p in selector.select_from(self):
             yield p
@@ -956,6 +960,8 @@ def rglob(self, pattern):
         drv, root, pattern_parts = self._flavour.parse_parts((pattern,))
         if drv or root:
             raise NotImplementedError("Non-relative patterns are unsupported")
+        if pattern[-1] in (self._flavour.sep, self._flavour.altsep):
+            pattern_parts.append('')
         selector = _make_selector(("**",) + tuple(pattern_parts), self._flavour)
         for p in selector.select_from(self):
             yield p

@brettcannon
Copy link
Member

You can also press "Update branch" to trigger testing.

You're right, but I was also trying to get rid of the old CLA check.

@e-kwsm e-kwsm force-pushed the fix-pathlib.Path.glob branch from 7c7f251 to 937b12a Compare April 27, 2022 11:17
@e-kwsm
Copy link
Contributor Author

e-kwsm commented Apr 27, 2022

@serhiy-storchaka Thank you for your review. I have reflected your suggestion.

Copy link
Contributor

@barneygale barneygale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@brettcannon
Copy link
Member

@e-kwsm would you mind signing our new digital CLA (see the "details" link in the failing status check)?

@e-kwsm
Copy link
Contributor Author

e-kwsm commented Apr 28, 2022

would you mind signing our new digital CLA (see the "details" link in the failing status check)?

Done.

@brettcannon brettcannon merged commit ea2f5bc into python:main Apr 28, 2022
@brettcannon
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants