Skip to content

CSV reader should support QUOTE_NOTNULL and QUOTE_STRINGS #113732

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
serhiy-storchaka opened this issue Jan 5, 2024 · 8 comments
Closed

CSV reader should support QUOTE_NOTNULL and QUOTE_STRINGS #113732

serhiy-storchaka opened this issue Jan 5, 2024 · 8 comments
Labels
3.13 bugs and security fixes type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented Jan 5, 2024

Feature or enhancement

New quoting rules QUOTE_NOTNULL and QUOTE_STRINGS were introduced in #67230. But they only affect CSV writer, not CSV reader. I think that they should affect CSV reader in the same way as QUOTE_NONNUMERIC does.

  • QUOTE_NOTNULL -- unquoted empty strings are returned as None.
  • QUOTE_STRINGS -- unquoted empty strings are returned as None and unquoted numeric strings are returned as float.

It is perhaps too late to change this in 3.12, so it can be considered as a new feature in 3.13.

Linked PRs

@serhiy-storchaka serhiy-storchaka added type-feature A feature request or enhancement 3.13 bugs and security fixes labels Jan 5, 2024
@serhiy-storchaka
Copy link
Member Author

On other hand, it is the documented behavior. So this can be considered as a bugfix.

  • It is a bug fix in 3.12.2.
  • It is a new feature in 3.13. 3.12.2 needs a documentation fix, and the change between 3.12 and 3.13 is breaking.

@serhiy-storchaka serhiy-storchaka added type-bug An unexpected behavior, bug, or error 3.12 only security fixes and removed type-feature A feature request or enhancement labels Jan 5, 2024
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jan 5, 2024
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jan 5, 2024
@merwok

This comment was marked as outdated.

@serhiy-storchaka
Copy link
Member Author

serhiy-storchaka commented Jan 5, 2024

I think that it is less disruptive than alternatives. For now, not much people use 3.12, and even less use new quoting options. But the mass adaptation of 3.12 will start before releasing 3.13, and at that time more code can be broken by 3.13 changes.

  1. Fix this in 3.12.2. It is only disruptive for these who use new quoting options in 3.12.0 and 3.12.1.
  2. Change the behavior in 3.13 without warning. It is disruptive for all users of 3.12, and also for users of 3.13 who occasionally run their code on the contemporary supported 3.12.
  3. Emit a FutureWarning in 3.13 and 3.14 and change the behavior in 3.15. It is how such changes should be done (when change an established behavior), but it increases the number of affected people and delays the proper solution.
  4. Emit a FutureWarning in 3.12.2+ and 3.13 and change the behavior in 3.14.
  5. Emit a FutureWarning in 3.12.2+ and change the behavior in 3.13.

It was better to do this before 3.12.0, but it was not done, and now we have no good options. The longer we delay the solution, the worse options we have. cc @Yhg1s

@merwok
Copy link
Member

merwok commented Jan 5, 2024

Ah, I missed that the options were introduced in 3.12! Option 1 to fix in 3.12.2 seems great then. Great that you tagged the release manager to confirm.

@ambv ambv changed the title CVS reader should support QUOTE_NOTNULL and QUOTE_STRINGS CSV reader should support QUOTE_NOTNULL and QUOTE_STRINGS Jan 5, 2024
@Yhg1s
Copy link
Member

Yhg1s commented Jan 16, 2024

We should consider observable behaviour, rather than documentation, to be leading. Changing the behaviour in 3.12.2 is too disruptive. Patch releases also should not add new warnings. Given that it was documented to behave this way, I don't think it's necessary to add a new warning for this. That it doesn't work as documented is a bug we can't fix in 3.12, but we can in 3.13.

@merwok
Copy link
Member

merwok commented Jan 30, 2024

That it doesn't work as documented is a bug we can't fix in 3.12, but we can in 3.13.

Would it be useful to add a note in the 3.12 docs?

aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
@Prometheus3375
Copy link
Contributor

Prometheus3375 commented Mar 11, 2024

Adding a note to the documentation will be useful indeed. Was just writing some .csv involved code and because of the documentation of new quoting constants increased python version. Then I realized that these constants does not work for reading as documentation suggests. Hopefully, I found this issue.

Actually, instead of the note of bugged behavior maybe it will be better to simply remove reader instruction in 3.12 documentation and then add in 3.13 as an addition.

Also, one of the None is not properly formatted.
image

@merwok
Copy link
Member

merwok commented Mar 12, 2024

instead of the note of bugged behavior maybe it will be better to simply remove reader instruction in 3.12 documentation

I don’t think so: it is common to read a newer version of the docs than the Python version you’re using. It’s best to be clear about the behaviour and the change here.

Does someone want to open a pull request for this?
If not, we should open a doc ticket to make sure this doesn’t get forgotten.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
Status: Done
Development

No branches or pull requests

4 participants