Pandas Reading csv file with " in the data

Time:07-17

I want to parse a CSV file, but the data looks like the sample below. When I use ," as the separator, the fields are not split into the columns correctly. Is there any way to ignore the " characters, or to escape them with a regex?

3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" 4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" 5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" 5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" 6,"Per Knut Aaland","M",31,188,75,"United States","USA"

Thanks in advance

CodePudding user response:

Reading the csv file (assuming no new line between the rows):

with open('data') as f:
    raw = f.readline()

Some splitting and processing:

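data = []
# Records are separated by a closing quote followed by a space.
for r in raw.strip().split('" '):
    # split() drops the closing quote of every record except the last; restore it.
    if not r.endswith('"'):
        r += '"'
    data.append(r.split(','))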

Creating the final dataframe:

import pandas as pd

df = pd.DataFrame(data)
df


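As a possible variation on the splitting step, the standard-library csv module can do the field parsing, which also strips the surrounding quotes and keeps any comma inside a quoted value in one field. This is only a sketch: the `raw` string is the sample from the question pasted inline, and the record boundary rule (a closing quote, a space, then a numeric id and a comma) is an assumption based on that sample, not something the question confirms for the whole file.

```python
import csv
import re

# Sample rows from the question, all on one line and separated by a single space.
raw = ('3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" '
       '4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" '
       '5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" '
       '5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" '
       '6,"Per Knut Aaland","M",31,188,75,"United States","USA"')

# Assumption: a new record starts at a space that follows a closing quote
# and precedes a numeric id plus a comma. The lookarounds keep both sides intact.
records = re.split(r'(?<=") (?=\d+,)', raw.strip())

# csv.reader accepts any iterable of strings; it removes the quotes and
# does not split on commas that appear inside a quoted field.
rows = list(csv.reader(records))

for row in rows:
    print(row)
```

The resulting `rows` list can be passed to pd.DataFrame(rows) in the same way as `data` in the answer; rows with fewer fields are padded with missing values.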