Home > database >  Pandas csv write not understanding empty space inside double quotes
Pandas csv write not understanding empty space inside double quotes

Time:11-01

I'm currently trying to read a csv file, add/delete/rename some columns using Pandas dataframe, and then write it back to the same file. However, a few of the files I'm using contain records which look like this:

Column 1,Column 2,Column 3,Column 4
123," ",ABCD,"This, that, and this"

I was able to successfully get Pandas to understand "This, that, and this" and write it back to the csv exactly the same with the quotation marks and commas. But unfortunately I can't seem to get the empty space surrounded by quotations " ". It will just write it back like this:

Column 1,Column 2,Column 3,Column 4
123, ,ABC,"This, that, and this"

My read looks like:

f = pd.read_csv((mypath   file), skipinitialspace=True, quotechar='"')

And my write looks like:

f.to_csv((mypath   file), index=False)

CodePudding user response:

It's still a valid CSV without the quotes around the space. A field only needs to be quoted if it contains the separator (comma).

You don't need skipinitalspace=True on the read. That's only needed if the CSV looks like A, B, C, D instead of A,B,C,D.

The quotechar parameter isn't needed either as " is the default.

The best you can do with pandas is quote everything if you need it:

import pandas as pd
import csv

f = pd.read_csv('sample.csv')
f.to_csv('out.csv',index=False, quoting=csv.QUOTE_ALL)

Given: sample.csv

Column 1,Column 2,Column 3,Column 4
123," ",ABCD,"This, that, and this"

Result: out.csv

"Column 1","Column 2","Column 3","Column 4"
"123"," ","ABCD","This, that, and this"

CodePudding user response:

as far as I know, you can use quoting in to_csv to quote all strings

import csv
import pandas as pd

f = pd.read_csv(('test.csv'), skipinitialspace=True, quotechar='"')
f.to_csv(('test2.csv'), quoting=csv.QUOTE_NONNUMERIC, index=False)

and the output like this

"Column 1","Column 2","Column 3","Column 4"
123," ","ABCD","This, that, and this"

or you can replace blank space with '" "' but the output will contain triple quote like """"""

f = pd.read_csv(('test.csv'), skipinitialspace=True, quotechar='"')
f["Column 2"] = f["Column 2"].replace(r'^\s*$', '\" \"', regex=True)
f.to_csv(('test2.csv'), index=False)

and the output like this

Column 1,Column 2,Column 3,Column 4
123,""" """,ABCD,"This, that, and this"
  • Related