I have hash characters (#) in some of the string fields in a CSV file. It looks like R has problems with them.
csv = "A;B;C
n;# 9;0
n;1;0"
read.table(text=csv, header=TRUE, sep=";", encoding="UTF-8")
Results in
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 1 did not have 3 elements
The CSV file is generated by Python using the csv.QUOTE_MINIMAL style. This means that string fields are only enclosed in quotes if necessary (e.g. when the string itself contains a quote character). There is no way to change that, so I have to deal with the # on the R side.
CodePudding user response:
read.table treats # as the start of a comment by default. Set comment.char to a character that does not occur in your data to change that.
read.table(text=csv, header=TRUE, sep=";", encoding="UTF-8", comment.char = '@')
#  A   B C
#1 n # 9 0
#2 n   1 0
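If no character can safely be reserved as a comment marker, comment handling can also be switched off completely by passing an empty string. A minimal sketch with the same sample data; stringsAsFactors = FALSE is only added here so that column B is a plain character vector on older R versions as well:

```r
csv <- "A;B;C
n;# 9;0
n;1;0"

# comment.char = "" disables comment handling, so '#' is kept as ordinary data
df <- read.table(text = csv, header = TRUE, sep = ";",
                 comment.char = "", stringsAsFactors = FALSE)

print(df$B)
```

This avoids having to guess a character such as '@' that might turn up in the data later.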
And that is why you should use read.csv() instead of read.table(). The former is the latter, but with defaults that make more sense for CSV files (for example, comment.char is empty by default).
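Applied to the sample from the question, that could look roughly like the sketch below. Note that read.csv() assumes a comma separator, so the semicolon still has to be passed explicitly; read.csv2() is the variant that defaults to sep = ";" (and to dec = ",", which makes no difference for this data).

```r
csv <- "A;B;C
n;# 9;0
n;1;0"

# read.csv: comment.char = "" by default, but sep = "," must be overridden
df1 <- read.csv(text = csv, sep = ";")

# read.csv2: sep = ";" and dec = "," by default
df2 <- read.csv2(text = csv)

print(df1)
print(df2)
```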