
Commit fd133d4

barneygale and zooba authored
GH-126601: pathname2url(): handle NTFS alternate data streams (#126760)
Adjust `pathname2url()` to encode embedded colon characters in Windows paths, rather than bailing out with an `OSError`.

Co-authored-by: Steve Dower <[email protected]>
1 parent e8bb053 commit fd133d4

4 files changed: +21 -14 lines changed


Doc/library/urllib.request.rst

Lines changed: 5 additions & 0 deletions
@@ -152,6 +152,11 @@ The :mod:`urllib.request` module defines the following functions:
    the path component of a URL. This does not produce a complete URL. The return
    value will already be quoted using the :func:`~urllib.parse.quote` function.
 
+   .. versionchanged:: 3.14
+      On Windows, ``:`` characters not following a drive letter are quoted. In
+      previous versions, :exc:`OSError` was raised if a colon character was
+      found in any position other than the second character.
+
 
 .. function:: url2pathname(path)
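
Reviewer note: the `versionchanged` entry says that stray colons are now "quoted". A minimal sketch of what that means in terms of `urllib.parse.quote`, assuming the same `safe` arguments that appear in the `Lib/nturl2path.py` change in this commit. The helper names `quote_drive` and `quote_tail` are hypothetical and exist only for illustration.

```python
from urllib.parse import quote


def quote_drive(drive):
    # Hypothetical helper: keep '/' and ':' literal so that a drive
    # prefix such as '///C:' survives quoting unchanged.
    return quote(drive, safe='/:')


def quote_tail(tail):
    # Hypothetical helper: with the default safe='/', a colon is
    # percent-encoded as %3A.
    return quote(tail)


print(quote_drive('///C:'))    # ///C:
print(quote_tail('/foo:bar'))  # /foo%3Abar
```

Only the colon that belongs to the drive letter is kept literal; any later colon, such as the one introducing an NTFS alternate data stream, ends up percent-encoded.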

Lib/nturl2path.py

Lines changed: 10 additions & 12 deletions
@@ -40,6 +40,7 @@ def pathname2url(p):
     # C:\foo\bar\spam.foo
     # becomes
     # ///C:/foo/bar/spam.foo
+    import ntpath
     import urllib.parse
     # First, clean up some special forms. We are going to sacrifice
     # the additional information anyway
@@ -48,16 +49,13 @@ def pathname2url(p):
         p = p[4:]
         if p[:4].upper() == 'UNC/':
             p = '//' + p[4:]
-        elif p[1:2] != ':':
-            raise OSError('Bad path: ' + p)
-    if not ':' in p:
-        # No DOS drive specified, just quote the pathname
-        return urllib.parse.quote(p)
-    comp = p.split(':', maxsplit=2)
-    if len(comp) != 2 or len(comp[0]) > 1:
-        error = 'Bad path: ' + p
-        raise OSError(error)
+    drive, tail = ntpath.splitdrive(p)
+    if drive[1:] == ':':
+        # DOS drive specified. Add three slashes to the start, producing
+        # an authority section with a zero-length authority, and a path
+        # section starting with a single slash.
+        drive = f'///{drive.upper()}'
 
-    drive = urllib.parse.quote(comp[0].upper())
-    tail = urllib.parse.quote(comp[1])
-    return '///' + drive + ':' + tail
+    drive = urllib.parse.quote(drive, safe='/:')
+    tail = urllib.parse.quote(tail)
+    return drive + tail
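
Reviewer note: for readers who want to try the new control flow outside CPython, here is a standalone sketch of the post-change logic. It is not the library function itself. The name `pathname2url_sketch` is hypothetical, and the backslash replacement and the `//?/` prefix check are assumed from the unchanged context around these hunks, which the diff only partly shows. `ntpath` is importable on every platform, so the sketch runs on non-Windows systems too.

```python
import ntpath
import urllib.parse


def pathname2url_sketch(p):
    """Hypothetical re-statement of the logic after this commit."""
    # Assumed from surrounding context (not fully visible in the diff):
    # normalise separators and strip the extended-path prefix.
    p = p.replace('\\', '/')
    if p[:4] == '//?/':
        p = p[4:]
        if p[:4].upper() == 'UNC/':
            p = '//' + p[4:]
    # From the diff: let ntpath decide what counts as a drive.
    drive, tail = ntpath.splitdrive(p)
    if drive[1:] == ':':
        # DOS drive: empty authority, then a path starting with '/'.
        drive = f'///{drive.upper()}'
    # Colons stay literal in the drive part only.
    drive = urllib.parse.quote(drive, safe='/:')
    tail = urllib.parse.quote(tail)
    return drive + tail


print(pathname2url_sketch('C:\\foo:bar'))  # ///C:/foo%3Abar
print(pathname2url_sketch('foo:bar'))      # foo%3Abar
```

Delegating drive detection to `ntpath.splitdrive()` is what removes the need for the old `OSError` branches: a colon that is not part of a drive simply falls into `tail` and is percent-encoded.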

Lib/test/test_urllib.py

Lines changed: 3 additions & 2 deletions
@@ -1429,8 +1429,9 @@ def test_pathname2url_win(self):
         self.assertEqual(fn('C:\\a\\b%#c'), '///C:/a/b%25%23c')
         self.assertEqual(fn('C:\\a\\b\xe9'), '///C:/a/b%C3%A9')
         self.assertEqual(fn('C:\\foo\\bar\\spam.foo'), "///C:/foo/bar/spam.foo")
-        # Long drive letter
-        self.assertRaises(IOError, fn, "XX:\\")
+        # NTFS alternate data streams
+        self.assertEqual(fn('C:\\foo:bar'), '///C:/foo%3Abar')
+        self.assertEqual(fn('foo:bar'), 'foo%3Abar')
         # No drive letter
         self.assertEqual(fn("\\folder\\test\\"), '/folder/test/')
         self.assertEqual(fn("\\\\folder\\test\\"), '//folder/test/')
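
Reviewer note: the removed "long drive letter" test and the new alternate-data-stream tests both hinge on how `ntpath.splitdrive()` splits the input. A small sketch of that split, using the same `drive[1:] == ':'` check as the patched function. The helper name `has_dos_drive` is hypothetical.

```python
import ntpath


def has_dos_drive(path):
    # Hypothetical helper mirroring the check in the patch: a DOS
    # drive is a single letter followed by a colon.
    drive, _tail = ntpath.splitdrive(path)
    return drive[1:] == ':'


for sample in ('C:\\foo:bar', 'foo:bar', 'XX:\\', '\\\\folder\\test\\'):
    print(repr(sample), ntpath.splitdrive(sample), has_dos_drive(sample))
```

With this split, `'XX:\\'` has no drive at all, so its colon is treated like any other character in the tail instead of triggering an error.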
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Fix issue where :func:`urllib.request.pathname2url` raised :exc:`OSError`
+when given a Windows path containing a colon character not following a
+drive letter, such as before an NTFS alternate data stream.
