I have a *.txt file with numbers. I want to eliminate the spaces. The raw data looks like this
12 12345 1234
23 23456 234
If I use the following
data=data[0].str.replace(" ","")
data.update('\'' data '\',')
I get
'1234123451234',
'2323456234',
which I want. But if I save it to csv with
data.to_csv("/Users/k/file.txt", header=None, index=None, mode='a')
I get as file values:
"'1234123451234',"
"'2323456234',"
If I use the quoating = csv.None or 3 (same)
data.to_csv("Users/k/file.txt", header=None, index=None, quoting=3, escapechar="\\", mode='a')
The file looks like:
'1234123451234'\,
'2323456234'\,
Just using space or nothing as escapechar
does not work.
If I just remove the spaces without adding quotes or commas and then save via:
data.to_csv("Users/k/file.txt", header=None, index=None, mode='a', quoting=1, sep=",")
I get:
"1234123451234"
"2323456234"
missing the comma.
Adding only the comma and saving as above gets me
"1234123451234,"
"2323456234,"
wrong place :-)
As you can see, I am getting mad over missing my target by inches, while it is most likely super easy. I probably will switch to regex :-)
CodePudding user response:
The output is expected. As you manually added '
to the strings, they are retained in the output. There's no need to manually add quotation marks at all, just set the correct options for to_csv
:
>>> df = pd.DataFrame(["12 12345 1234", "23 23456 234"])
>>> df
0
0 12 12345 1234
1 23 23456 234
>>> df[0] = df[0].str.replace(" ", "")
>>> df
0
0 12123451234
1 2323456234
>>> df.to_csv("output.csv", quoting=1, quotechar="'", line_terminator=',\n')
# output.csv
'','0',
'0','12123451234',
'1','2323456234',
CodePudding user response:
I would suggest not using Pandas at all, given that your transformation is pretty simple.
Something like this should work:
with open('input.txt', 'rt') as f_in:
with open('output.txt', 'wt') as f_out:
for line in f_in:
line = line.strip()
line = line.replace(' ', '')
line = '\'' line '\',\n'
f_out.write(line)