Description
Per discussion on issue #33 (from here down), GoAWK handles CR LF (Windows) line endings differently from gawk (I haven't tried awk or mawk). GoAWK doesn't include the CR in the last field (because it treats the CR as part of the line ending), whereas gawk does. I'm not sure if there are differences between gawk's handling on Windows and Linux.
I kinda think the GoAWK approach is more sensible and platform-native, but consistency with other AWKs is good too ... worth thinking about further.
Arnold Robbins said this:
Gawk is consistent. RS has the default value of \n and that is what terminates records. As far as gawk is concerned, the \r is no different from any other character, which is why it appears as part of the last field in the record.
That said, on Windows, I believe the default is to work in text mode, in which case gawk never sees the \r\n line ending, it only sees \n. One can use BINMODE to force gawk to see those characters, in which case you would need to set RS = "\r?\n" in order to get correct processing.
Take the Windows advice with a grain of salt. I have not used a Windows system directly in over two years, and when I did I used Cygwin, so some experimentation may be in order.
If one is processing a Windows file on Linux, then one should use a utility like dos2unix on the file, or tr, before sending the data to GoAwk, which does not (yet! hint, hint) allow RS to be a regular expression. Using GoAwk on Windows, well, you'll have to figure out what the Go runtime is handing off to your code.