Unicode character issue with unicode units (e.g. cm²) #133
Comments
Unicode all the things! That said, my worry is that this ascii encoding is actually for compatibility with the underlying |
In addition, when adding a unicode test to try this out, pytest is causing a SegFault. It isn't clear that this is exactly cf-units' fault (pytest-dev/pytest#2875), but it does relate to the fact that a Cython exception is being raised that contains a Example test:
Installing pytest-faulthandler did not solve the issue for me (as was suggested in the linked issue).
On my travels I found some pretty scary SegFaults from the Python interpreter...
There are 19 other issues on pytest that reference SegFaults... 😱 Time to track down the issue.
Workaround, as suggested in pytest-dev/pytest#4406 (comment), is to use [EDIT]: I got the conclusion wrong; it does not work.
So if I think about it a little bit, this is because pytest tries giving us context about what objects went into each of the calls. The problem is that the constructor of The
On the whole, this is a pretty safe class - you can't construct one unless you are really not playing ball:
However, it isn't as safe as you think when using pytest, because it is trying to be helpful by repr-ing the objects in the stack trace for a failing test:
I will raise this in the pytest issue tracker, but I suspect that the only way for them to avoid this would be to stop repr-ing the object in the stack trace. That would really hamper the quality of the error log, and as a user of pytest I'd be really disappointed to lose that feature. I'll raise it there in case there is a solution that allows some middle ground (I don't know the pytest codebase, so there may be enough context to know whether the thing was constructed correctly). Given the above paragraph, I conclude that cf-units should implement a safe
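The failure mode described above can be sketched in pure Python. This is a minimal illustration, not the cf-units code: `Handle`, `_spec` and `make_partial` are hypothetical names invented for the example. In pure Python an unguarded `__repr__` on a half-built object would only raise `AttributeError`; in a compiled extension type the equivalent access may touch uninitialised state, which would be consistent with the SegFault seen here. The sketch shows one way a `__repr__` can be made safe to call on an instance whose constructor never completed.

```python
class Handle:
    """Hypothetical stand-in for an extension type wrapping a C resource."""

    def __init__(self, spec):
        # Validation fails *before* the wrapped state is assigned, so a
        # failed construction leaves the instance without `_spec`.
        if not isinstance(spec, str):
            raise TypeError("spec must be a string")
        self._spec = spec

    def __repr__(self):
        # Guard against partially constructed instances: a traceback
        # formatter may call repr() on `self` after __init__ has raised.
        spec = getattr(self, "_spec", None)
        if spec is None:
            return "<Handle (uninitialised)>"
        return "Handle({!r})".format(spec)


def make_partial():
    """Return an instance whose __init__ raised part-way through."""
    obj = Handle.__new__(Handle)
    try:
        obj.__init__(42)  # not a string, so __init__ raises
    except TypeError:
        pass
    return obj
```

With the guard in place, `repr(make_partial())` returns a placeholder string instead of failing, so a test runner that reprs objects in a failing frame has something harmless to print.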
Also, allow a file encoding in the coding standards test, so that we can have some literal unicode characters for testing with.
* Always treat units as unicode. Closes #133. Also, allow a file encoding in the coding standards test, so that we can have some literal unicode characters for testing with.
* Fix date2num test which is incorrectly using repr when str was intended.
* Tidy up the Unit constructor for the py2 case, so that it is easier to see that the code can be deleted when the codebase becomes py3 only.
* Handle unicode object in py2 specially, and always ensure that py2k returns a non-unicode for __str__ (unless sys.getdefaultencoding says otherwise).
* Ensure that the error raised in Unit constructor handles unicode too.
🎉 π m²
According to the udunits grammar, superscript characters are supported (who knew!).
Looks like this isn't supported in cf_units though:
That encode('ascii') is looking pretty shaky here... FWIW, this is quite an esoteric issue - I only found this by studying the UDUNITS grammar carefully - I haven't found this in the wild.
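Why an ASCII encode step cannot carry a unit such as cm² can be shown with the standard library alone. This is a small sketch, not the cf_units implementation: `ascii_fails` and `encode_unit` are hypothetical helpers, and the choice of UTF-8 as the alternative is an assumption, since this thread does not settle which encoding should be handed to UDUNITS.

```python
def ascii_fails(spec):
    """Return True if `spec` cannot be represented in ASCII."""
    try:
        spec.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


def encode_unit(spec, encoding="utf-8"):
    """Encode a unit string to bytes; UTF-8 is an assumed default."""
    return spec.encode(encoding)
```

For example, `ascii_fails("cm²")` is true while `ascii_fails("cm2")` is false, and `encode_unit("m²")` yields the two-byte UTF-8 sequence for the superscript after the `m`.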