save dataframe as csv correctly

Initially, I have this dataframe:

I save this as a csv file by using:

df.to_csv('Frequency.csv')

The problem appears when I read the csv file back with:

pd.read_csv("Frequency.csv")

The dataframe then looks like this:

Why is there an extra column added, and why did the index change? I suppose it has something to do with how the dataframe should be saved as a csv file, but I am not sure.

CodePudding user response：

That's because of the index.

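If the index holds data you want to keep, use pandas.DataFrame.reset_index to turn it into a regular column when saving the .csv, and pass index=False so the new default index is not written as well:

df.reset_index().to_csv('Frequency.csv', index=False)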
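
To see what reset_index changes in the written file, here is a minimal sketch. The frame, its "label" index name and the use of io.StringIO instead of a real file are assumptions made for illustration, since the original dataframe is not shown.

```python
import io
import pandas as pd

# Hypothetical frame whose index carries real data (the labels)
df = pd.DataFrame(
    {"frequency": [3, 1, 2]},
    index=pd.Index(["a", "b", "c"], name="label"),
)

# reset_index moves the labels into a regular column named "label"
flat = df.reset_index()

# Written with the default settings, the new 0..n-1 index is saved too
cols_default = list(pd.read_csv(io.StringIO(flat.to_csv())).columns)

# Written with index=False, only the data columns are saved
cols_no_index = list(pd.read_csv(io.StringIO(flat.to_csv(index=False))).columns)

print(cols_default)   # ['Unnamed: 0', 'label', 'frequency']
print(cols_no_index)  # ['label', 'frequency']
```

So reset_index on its own does not remove the extra column; it only helps when the old index is worth keeping as data.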

CodePudding user response：

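By default, to_csv writes the index as the first column of the csv. That column has no header, so when you load the file, read_csv treats it as a regular column named "Unnamed: 0" and adds a fresh default index on top. You can get around it by writing the csv like this.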

df.to_csv('Frequency.csv', index=False)
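
A quick way to check this is a round trip through an in-memory buffer. This is only a sketch: the column names and values are made up, and io.StringIO stands in for the file on disk.

```python
import io
import pandas as pd

# Hypothetical stand-in for the dataframe in the question
df = pd.DataFrame({"value": [10, 20, 30], "count": [3, 1, 2]})

# Default: the index is written as an unnamed first column
with_index = pd.read_csv(io.StringIO(df.to_csv()))
print(list(with_index.columns))   # ['Unnamed: 0', 'value', 'count']

# index=False: only the data columns are written
round_trip = pd.read_csv(io.StringIO(df.to_csv(index=False)))
print(list(round_trip.columns))   # ['value', 'count']
```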

CodePudding user response：

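Use one of these two options to save and read:

# Option 1: don't save the index column in the first place
df.to_csv('Frequency.csv', index=False)
pd.read_csv("Frequency.csv")

# Option 2: save the index (the default) and read the first column back as the index
df.to_csv('Frequency.csv')
pd.read_csv("Frequency.csv", index_col=0)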

Example:

import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}
df1 = pd.DataFrame(data)

# index=False: no index column is written, so read without index_col
df1.to_csv('calories.csv', index=False)
pd.read_csv("calories.csv")

Don't combine index=False with index_col=0. With index=False there is no index column in the file, so index_col=0 would turn the first data column (calories) into the index. The index that Jupyter shows after reading is just the default one every dataframe has, not an extra column in the file. If you want to keep the original index, use the other pair instead:

df1.to_csv('calories.csv')
pd.read_csv("calories.csv", index_col=0)
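
To check what each combination of options actually does, here is a small sketch on the same calories data. It uses io.StringIO instead of calories.csv, which is an assumption made only to keep the example self-contained.

```python
import io
import pandas as pd

df1 = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

# Index written (default) and read back as the index
kept = pd.read_csv(io.StringIO(df1.to_csv()), index_col=0)

# Index not written, but the first column is still consumed as the index
mixed = pd.read_csv(io.StringIO(df1.to_csv(index=False)), index_col=0)

print(list(kept.columns))   # ['calories', 'duration']
print(list(mixed.columns))  # ['duration']
print(mixed.index.name)     # calories
```

In the mixed case the calories values are no longer a column, which is why the two options should match on the write and read side.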