I have a CSV file with the following single line:
Some,Test,"is"thisvalid,or,not
According to csvlint.io, it's not valid CSV.
However, according to https://www.toolkitbay.com/tkb/tool/csv-validator
it is valid CSV. Which site is lying?
CodePudding user response:
Whether it is "valid" or not depends on the definition you, and the websites you found, are using. If you asked about "well-formed XML", everyone would agree that should be based on the W3C standard; or "valid HTML" would now probably refer to the WHATWG Living Standard. "Valid CSV" has no such universal definition - although there are standards for CSV, they've been written after years of use, in the rather optimistic hope that existing implementations will be amended to follow them.
So neither tool is "lying"; they evidently just disagree on what "valid" means.
A far more useful question than whether the CSV is "valid" is whether it is interpreted the way you want by whatever tool you use to process it. From a practical point of view, the unusual positioning of the quote marks is likely to be interpreted differently by different tools, so it is probably best avoided if interoperability matters for your use case.
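As a rough illustration of how one particular tool reads that line, here is a minimal sketch using Python's standard `csv` module with its default (lenient) settings. It is only one data point; other parsers may well split or unquote the third field differently.

```python
import csv
import io

# The single line from the question, with a trailing newline.
line = 'Some,Test,"is"thisvalid,or,not\n'

# Default dialect, lenient parsing (strict=False).
row = next(csv.reader(io.StringIO(line)))

print(len(row))  # number of fields the parser found
print(row[2])    # how the oddly quoted third field was read
```

With these defaults the parser treats `"is"` as a quoted section and then appends the text that follows it, so the third field comes back as `isthisvalid` with the quote marks silently dropped. That may or may not be what you intended.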
CodePudding user response:
The usual reference for the CSV format is RFC 4180: https://datatracker.ietf.org/doc/html/rfc4180
It states:
Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.
If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote.
Your third field contains double quotes that are not doubled, and text follows what looks like the closing quote, so by these rules your CSV is not valid.
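If you want a quick programmatic check along these lines, one option is the `strict=True` setting of Python's standard `csv` module, which raises an error on malformed quoting instead of guessing. This is only a sketch: strict mode is not a full RFC 4180 validator, and the helper name `parses_strictly` is made up for this example.

```python
import csv
import io

def parses_strictly(text):
    """Return True if Python's csv reader accepts `text` in strict mode.

    Hypothetical helper for illustration; strict mode catches bad quoting
    but does not check every RFC 4180 rule.
    """
    try:
        # Consume the whole input so any parse error surfaces.
        list(csv.reader(io.StringIO(text), strict=True))
        return True
    except csv.Error:
        return False

question_line = 'Some,Test,"is"thisvalid,or,not\n'
control_line = 'Some,Test,plain,or,not\n'

print(parses_strictly(question_line))
print(parses_strictly(control_line))
```

The line from the question is rejected here because a character other than the delimiter follows the closing quote, while the plain control line is accepted.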
If your columns are meant to be these:
+------+------+---------------+----+-----+
| 1    | 2    | 3             | 4  | 5   |
+------+------+---------------+----+-----+
| Some | Test | "is"thisvalid | or | not |
+------+------+---------------+----+-----+
then the valid version is this:
Some,Test,"""is""thisvalid",or,not
https://csvlint.io/ also reports this version as valid.
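Rather than escaping by hand, you can let a CSV library produce the quoting for you. A minimal sketch with Python's standard `csv` module, assuming its default settings (`QUOTE_MINIMAL` with doubled quotes), which follow the escaping rule quoted above:

```python
import csv
import io

# The five intended field values; the third one contains literal quotes.
fields = ["Some", "Test", '"is"thisvalid', "or", "not"]

# Write one row; newline="" leaves the writer's CRLF line ending untouched.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(fields)
encoded = buf.getvalue()
print(encoded, end="")

# Round trip: reading the encoded line back should give the original values.
decoded = next(csv.reader(io.StringIO(encoded, newline="")))
print(decoded == fields)
```

The writer wraps the third field in double quotes and doubles the quotes inside it, giving the same line as the corrected version above, and reading it back restores the original field values.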