Use C-style IO instead of ifstream for parsing #193
ToruNiina merged 4 commits into ToruNiina:master from
|
Hello, @ToruNiina, any feedback? I see the Windows CI has failed; unfortunately, I am not able to test on Windows. |
|
This problem is difficult to reproduce and I haven't actually reproduced it, but I assume that it will occur (under very limited conditions). I looked at several C++ libraries (better known than mine) that parse a file and found that most of them use fstream internally after taking a filename. Some support a Also, there is some trouble around So I think it is safer to use fstream if a filename is passed and add a new function to receive a |
|
The goal of the PR is not to have a In your code, you check the After the stream is open, you can set the exception mask via the The unfortunate conclusion I have from this is that the I'm not at all happy with using Of course, everything works fine as long as you're not getting errors, and if you do, especially for an advanced user it's usually not hard to figure out what's wrong with reading a file (not found and permissions are by far the most common). But (especially) for a newbie user of a tool (that uses toml11 to work with some of its files), getting a concrete error message is much more helpful. And in the <2% of some more cryptic errors, having a concrete error message is very valuable as it can save a ton of debugging. |
|
@ToruNiina would you please respond to my last comment? I find poor error handling a serious issue which is important for deciding whether to even use a library or not. |
ToruNiina
left a comment
I agree with you that we need to improve the error messages regarding file I/O.
So, the problem here is about a FILE*.
The policy of using FILE* has some problems, like
- functions such as fopen are sometimes problematic on non-ASCII environments. So, additional environment-dependent macros must be included, which makes the code less readable.
- Since filesystem::path is aware of non-ASCII environments, it is also sometimes problematic when we try to use filesystem::path and fopen at the same time.
- Most C++ libraries use fstream, so using FILE* violates the principle of least surprise.
- And this code currently does not run on Windows in CI. (You asked me to fix it, but my primary environment is linux and secondary is macOS. Debugging on windows is not easy for me either.)
I think the advantage of a FILE pointer is errno. But, practically, libstdc++, libc++, and microsoft/STL all use FILE* internally, and the standard library does not reset errno to zero, so the errno corresponding to the underlying error is still available at the point where we detect that a stream operation failed, even if we use fstream. I said "practically" because the standard does not say anything about errno in the Input/Output section. But since all of the best-known implementations work this way, I think we can assume that errno can be obtained. People who use an in-house C++ standard library that implements fstream without FILE* may get a useless error message, but that is no different from the current state.
So, to me, the advantages do not seem large enough to outweigh the disadvantages. If you use fstream and errno in your patch, I will merge it.
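As a rough illustration of the fstream-plus-errno idea described above, here is a minimal sketch. The helper name `open_or_throw` is hypothetical and not part of toml11, and the code relies on the same assumption stated in the comment: that the standard library implementation leaves errno set by the underlying open call, which the C++ standard does not guarantee.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of toml11): open a file with std::ifstream
// and, on failure, build an error message from errno.
// Assumption: the implementation leaves errno as set by the underlying
// open call. The C++ standard does not promise this.
std::ifstream open_or_throw(const std::string& path)
{
    errno = 0; // clear stale values so only this call's error is reported
    std::ifstream ifs(path, std::ios_base::binary);
    if (!ifs.is_open())
    {
        const int err = errno; // save before anything else can overwrite it
        std::string msg = "could not open \"" + path + "\"";
        if (err != 0)
        {
            msg += ": ";
            msg += std::strerror(err);
        }
        throw std::runtime_error(msg);
    }
    return ifs; // std::ifstream is movable since C++11
}
```

If errno happens to be left at zero, the message simply omits the system error text, which matches the "useless error message" fallback mentioned above.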
This is too vague, sorry; I'm not aware of the issues. I assume that as long as you use the same encoding as the FS is using, it should work? I'm just mentioning it for completeness, though; since you seem to dislike the solution in general, we maybe don't need to go down this path any further 🙂.
I don't understand this argument at all. What is the least surprise useful for? The
This is indeed quite problematic.
Since it's not a part of the specification, it's only an assumption that
I'd say it has issues which could be solved, and then the advantages would outweigh the disadvantages, but it's ultimately your code. So a fallback solution, since I need reasonable error handling, would be to read the files myself and just pass the string to toml11. I can do it using a I was wondering if it would be possible to add an interface that would accept an rvalue of the data (along with a const reference for a more well-rounded API), so that it wouldn't need to do the copy. You're using |
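To make the rvalue/const-reference idea above concrete, here is a minimal sketch of such an overload pair. The `source_buffer` type is purely hypothetical and is not toml11's API; it only shows how an rvalue overload lets the callee take over the caller's buffer instead of copying it.

```cpp
#include <string>
#include <utility>

// Hypothetical type for illustration only; not part of toml11.
class source_buffer
{
  public:
    // Const-reference overload: the caller keeps its string, so we copy.
    explicit source_buffer(const std::string& text) : text_(text) {}

    // Rvalue overload: take ownership of the caller's buffer, no copy.
    explicit source_buffer(std::string&& text) noexcept : text_(std::move(text)) {}

    const std::string& text() const noexcept { return text_; }

  private:
    std::string text_;
};
```

A caller that has just read a file into a `std::string` would pass it with `std::move(...)`, while a caller that still needs the string passes it as is and pays for one copy.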
|
The reason why we use Note that, as I said before, we also have an option to add a
|
You are, however, effectively working with a string. Even if you don't use any
Not only impractical, most people use
Yes, the signature for that function would be the same, we'd need to call it differently, e.g.
You think there won't be any further issues on Windows besides the For me, the two options ( |
|
I prefer |
The fstream classes are notorious for their non-existent error handling. This adds an alternative interface based on C-style FILE * IO (fopen(), etc.), so that a user who needs reliable error handling can use that less convenient but more robust approach.
Set the exceptions mask so that exceptions are thrown when an I/O error occurs. Also throw the same exception type when the opening fails.
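The commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: the function name `read_all_or_throw` and the message strings are made up, and the exact mask used in toml11 may differ.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>

// Illustrative sketch only; names and messages are assumptions.
std::string read_all_or_throw(const std::string& path)
{
    std::ifstream ifs(path, std::ios_base::binary);
    if (!ifs.is_open())
    {
        // Throw the same exception type the stream itself would throw.
        throw std::ios_base::failure("could not open file: " + path);
    }

    // From here on, any operation that sets failbit or badbit throws
    // std::ios_base::failure instead of failing silently.
    ifs.exceptions(std::ios_base::failbit | std::ios_base::badbit);

    ifs.seekg(0, std::ios_base::end);
    const std::streamoff size = ifs.tellg();
    if (size < 0)
    {
        throw std::ios_base::failure("could not determine size of: " + path);
    }
    ifs.seekg(0, std::ios_base::beg);

    // Read exactly `size` bytes so that hitting EOF early is an error.
    std::string content(static_cast<std::size_t>(size), '\0');
    ifs.read(&content[0], static_cast<std::streamsize>(size));
    return content;
}
```

Reading exactly the reported size matters here: with failbit in the mask, a read that runs into end-of-file would itself throw.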
9d58293 to 6c2c804
|
It appears the CI needs approval. I've made the change to have the I've also made small improvements to the |
|
Curious the CI has passed on Windows, even with the @ToruNiina what do you think about the new approach? |
|
Another gentle ping. @ToruNiina please review if time at all allows, we kind of need reliable error handling... |
|
It looks good to me. I have merged it. If you use Also, in some countries (including where I live), Windows by default uses a character encoding that was developed/specified in that country (though these days it seems possible to change it to UTF-8 via settings). This makes the situation more complicated because the character encoding of the filesystem (UTF-16) can differ from that of the source code file. |
The fstream classes are notorious for their non-existent error handling. The stream error bits weren't even checked in the code, resulting in arbitrary malfunctions or crashing in case of an error.
This replaces the ifstream with C-style file IO (fopen(), etc.) and adds an exception class that is thrown in case of errors, including a descriptive error message.
It preserves the parse() variant for an istream, though I'd say using it with an ifstream is for the large part detrimental.
I'm introducing a new file_io_error exception class, let me know if it needs adjusting. It inherits from std::runtime_error, not from toml::exception, as that didn't seem to be a good fit. On a side note I think the whole exception class hierarchy should inherit from std::runtime_error, mainly to preserve a distinction from std::logic_error, as the API caller may or may not want to catch those.
As for how I've run into this, I've had the ifstream code crash on me due to mistakenly passing a directory path on a tmpfs filesystem. It failed on the assert(fsize >= 0); and left me dumbfounded for a while. I can't easily add that to the unittests, but the new code presents a reasonable error message for that case.
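For readers unfamiliar with the approach, here is a minimal sketch of what an errno-carrying exception plus fopen()-based reading could look like. Everything in the `sketch` namespace is an assumption made for illustration (constructor signature, accessor name, message format); it is not the code from this PR and not toml11's actual file_io_error.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>

namespace sketch // hypothetical namespace; not toml11's real API
{
// Exception carrying the errno value along with a descriptive message.
class file_io_error : public std::runtime_error
{
  public:
    file_io_error(int errnum, const std::string& what_arg, const std::string& path)
        : std::runtime_error(what_arg + " \"" + path + "\": " + std::strerror(errnum)),
          errnum_(errnum)
    {}
    int get_errno() const noexcept { return errnum_; }

  private:
    int errnum_;
};

// Read a whole file with C stdio, reporting open and read errors via errno.
std::string read_file(const std::string& path)
{
    std::FILE* fp = std::fopen(path.c_str(), "rb");
    if (fp == nullptr)
    {
        const int err = errno; // capture before any other call can change it
        throw file_io_error(err, "error opening", path);
    }

    std::string content;
    char buf[4096];
    std::size_t n = 0;
    while ((n = std::fread(buf, 1, sizeof buf, fp)) > 0)
    {
        content.append(buf, n);
    }
    if (std::ferror(fp))
    {
        const int err = errno;
        std::fclose(fp);
        throw file_io_error(err, "error reading", path);
    }
    std::fclose(fp);
    return content;
}
} // namespace sketch
```

Because the exception derives from std::runtime_error, a caller can catch it generically or catch `sketch::file_io_error` and inspect `get_errno()` to distinguish, say, a missing file from a permissions problem.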