-
Notifications
You must be signed in to change notification settings - Fork 18k
encoding/csv: cut \r before EOF in input #22937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Dup of #22746? |
Nope, not an exact duplicate - can reproduce on https://play.golang.org/p/U75TI4ouaG
And, as expected, on 1.9.2:
|
It's not obvious that this is valid since endline terminators should be "\r\n" or "\n". However, this string abruptly ends on "\r". However, it shouldn't be hard preserving the old behavior. |
I naively agree with @dsnet, but I don't know the CSV landscape enough to know what's commonly produced or accepted. |
Essentially, it comes down to the old behavior being equivalent to: line = strings.TrimSuffix(strings.TrimSuffix(line, "\n"), "\r") While the new behavior is equivalent to: if strings.HasSuffix(line, "\r\n") {
line = strings.TrimSuffix(line, "\r\n")
} else {
line = strings.TrimSuffix(line, "\n")
} The old behavior is simpler, but the new behavior is a more correct interpretation of "\r\n" and "\n" as valid newline separators. Note also that we don't treat bare "\r" as newline separators, so it doesn't make sense to me why we should allow it as the trailing marker at EOF. Furthermore, note that the RFC actually says that newline separators should be CRLF, so the fact that we allow bare LF is technically non-compliant. Marking this as "Needs Decision". I'm slightly in favor of saying the new behavior is correct. |
FWIW I don't have a particular problem with the new behaviour. The reason I reported it was to be sure it wasn't changed by mistake and I didn't know if it was correct. Whatever is decided, would an additional unit test be welcome to cover it? |
Interestingly, I actually added a test when this behavior changed: |
We should probably restore the old behavior: a \r before EOF gets cut just like a \r before \n. I think it is debatable whether the old behavior or current behavior is more "correct", and therefore we should err on the side of preserving the behavior from the previous release. |
@dsnet, interested in working on a fix? |
Change https://golang.org/cl/81577 mentions this issue: |
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (
go version
)?go version devel +3a395e2283 Fri Nov 17 19:00:41 2017 +0000 linux/amd64
Does this issue reproduce with the latest release?
no
What operating system and processor architecture are you using (
go env
)?GOARCH="amd64"
GOBIN=""
GOCACHE="/home/mgarton/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/mgarton/gopath"
GORACE=""
GOROOT="/home/mgarton/go"
GOTMPDIR=""
GOTOOLDIR="/home/mgarton/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build420727529=/tmp/go-build -gno-record-gcc-switches"
What did you do?
Tried to read a csv with Go tip that worked with Go1.9
If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.
https://play.golang.org/p/AfQJBQ-2JC
What did you expect to see?
The same behaviour as in 1.9
What did you see instead?
parse error on line 1, column 2: extraneous or missing " in quoted-field
The text was updated successfully, but these errors were encountered: