-
-
Notifications
You must be signed in to change notification settings - Fork 32k
configparser accepts invalid keys and sections when writing #65697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
When adding something to a configparser instance which has a key beginning with a comment char, it writes the data to a file without generating an error, and when reading the file back obviously the data is different as it's a comment: >>> cp = configparser.ConfigParser()
>>> cp.read_dict({'DEFAULT': {';foo': 'bar'}})
>>> cp.write(sys.stdout)
[DEFAULT]
;foo = bar This was discussed on python-dev here: Of course there are other corner cases as well, like having a key like "[foo]" or "=bar". I think whatever data I pass into a configparser should also come out again when reading the file back. Since there's no escaping in configparser, I think the ideal solution would be configparser refusing to write ambigious values. While this is technically a backwards-incompatible change, applications doing this were broken in the first place, so validation while writing will not break anything. Validating when setting values would be better of course, but this can potentially break applications where configparser is used without actually writing a file. |
Any update on this? Discussion over Python-Dev looks like finished without a consensus/resolution. |
#69909 is similar - and much more detailed. |
See also #69909, #67490, #65122, where the prior consensus seems to have been to be lenient to what the user provides, and leave it up to them to provide valid inputs. I can see the flip side that it would be valuable to provide some level of validation and reasonableness checks. However, it appears there are cases where other parsers might accept inputs that Python's configparser might not accept (e.g. section names like I've re-classified this as a feature and not a bug as the current approach is stable and defended and any change is going to be some kind of enhancement. I think the next thing that needs to happen is to draft a design that addresses the concerns laid out in the various discussions. Such a design should probably start by synthesizing all of the concerns brought up in these different threads (the issues mentioned above and other discussions linked from those). An LLM might be helpful in performing such a synthesis. |
I'm happy to sponsor this work (guide, direct, mentor), but I won't be working on it myself and I don't have a strong opinion on the design. The key is going to be to garner consensus. |
A debatable feature but not a bug, granted.
Classifying this entire issue as a feature does not make sense to me... The demoes I gave in in:
... both create very basic, valid It's pretty common to turn everything upside and down and rewrite everything to fix a collection of bugs all at once (many developers love to write new code and prefer that sort of approach), yet that does not make each of those bugs a "feature". As mentioned in #69909, it is possible to use |
This appears to have been fixed. The regex for parsing section names (RawConfigParser._SECT_TMPL = >>> import configparser
>>> import sys
>>> cfg = configparser.ConfigParser()
>>> cfg.read_dict({'T[e]st' : {'foo' : 'bar'}})
>>> cfg.write(sys.stdout)
[T[e]st]
foo = bar
>>> with open('example.ini', 'w') as fp:
... cfg.write(fp)
...
>>> file2cfg = configparser.ConfigParser()
>>> with open('example.ini', 'r') as fp:
... file2cfg.read_file(fp)
...
>>> file2cfg.write(sys.stdout)
[T[e]st]
foo = bar
>>>
Since our primary concern is writing and reading back different data, I think we should only sanitize/reject data in the write() function, not prevent users from setting potentially bad data in the configparser. As far as I understand, the remaining concerns are as follows: line comments are not escaped, newlines are not escaped (which allows for injection), and key names can contain delimiters which corrupt the result of a following read. On comments, I'm not sure what we'd gain from escaping them aside from making it more difficult to comment files programmatically. I'd suggest leaving existing functionality as is: leading comment prefixes in keys and trailing prefixes in values cause the parser to ignore the rest of the line, otherwise they are read back. >>> import configparser
>>> import sys
>>> cfg = configparser.ConfigParser()
>>> cfg.read_dict({'test' : {';foo' : 'bar', '\;foo' : 'bar'}})
>>> cfg.set('test', 'c;', 'd')
>>> cfg.set('test', 'e', 'f ; this is a comment')
>>> cfg.read_dict({';test2' : {'a' : 'b'}}
>>> cfg.write(sys.stdout)
[test]
;foo = bar
\;foo = bar
c; = d
e = f ; this is a comment
[;test2]
a = b
>>> with open('example.ini', 'w') as fp:
... cfg.write(fp)
...
>>> file2cfg = configparser.ConfigParser(inline_comment_prefixes=(';', '#'))
>>> with open('example.ini', 'r') as fp:
... file2cfg.read_file(fp)
...
>>> file2cfg.write(sys.stdout)
[test] #note the parser skipped the first element since it was a comment
\;foo = bar
c; = d
e = f
[;test2]
a = b Things get interesting with newlines. Using marc’s example in (#69909 (comment)) but escaping the newline in ‘[privileged_section]\nevil’ produces:
Which reads back as:
It seems escaping newlines is not sufficient to defend against this particular attack, but escaping the leading ‘[‘ does prevent a malicious user from adding new sections in this manner. On delimiters, since they are user-defined and cannot necessarily be escaped, I'd suggest creating a new InvalidKeyError class and raising one if a user attempts to write a key with a delimiter to a file. This would halt file writing since write() does not try to catch an error on _write_section(), which is probably fine. |
Fair enough. I'm not opposed to there being selective fixes for uncontested flaws. |
…ead (#129270) --------- Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
…erly read (python#129270) --------- Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
The text was updated successfully, but these errors were encountered: