save dataframe as csv correctly

Time:11-30

Initially, I have this dataframe:

dataframe 1

I save this as a csv file by using:

df.to_csv('Frequency.csv')

The problem appears when I read the csv file back with:

pd.read_csv("Frequency.csv")

The dataframe then looks like this:

dataframe 2

Why is there an extra column, and why did the index change? I suppose it has something to do with how the dataframe should be saved as a csv file, but I am not sure.

CodePudding user response:

That's because of the index.

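Try pandas.DataFrame.reset_index when saving the .csv, together with index=False so that the old index is written as a regular column and no new one is added:

df.reset_index().to_csv('Frequency.csv', index=False)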
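
A minimal sketch of what reset_index does on save, assuming the index holds real data. The frame, the index name "word" and the column "frequency" are made up for illustration, and the csv is written to a string instead of a file:

```python
import io
import pandas as pd

# Hypothetical frame whose index carries real data (names are invented)
df = pd.DataFrame(
    {"frequency": [3, 1, 2]},
    index=pd.Index(["a", "b", "c"], name="word"),
)

# reset_index() moves the index into a regular column called "word";
# index=False keeps the fresh 0..n-1 index out of the output
csv_text = df.reset_index().to_csv(index=False)

restored = pd.read_csv(io.StringIO(csv_text))
print(list(restored.columns))  # ['word', 'frequency']
```

If the index has no name, reset_index() calls the new column "index".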

CodePudding user response:

It's writing the index into the csv, so when you load the file, the index comes back as an unnamed column. You can avoid that by writing the csv like this:

df.to_csv('Frequency.csv', index=False)
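
To see where the extra column comes from, here is a small sketch comparing both ways of saving. The data is hypothetical (the question's dataframe is not shown), and the csv is kept in a string rather than written to disk:

```python
import io
import pandas as pd

# Hypothetical data standing in for the question's dataframe
df = pd.DataFrame({"word": ["a", "b", "c"], "frequency": [3, 1, 2]})

# Default: the index is written as a first column with an empty header
with_index = df.to_csv()
# index=False: only the data columns are written
without_index = df.to_csv(index=False)

round_trip_with = pd.read_csv(io.StringIO(with_index))
round_trip_without = pd.read_csv(io.StringIO(without_index))

print(list(round_trip_with.columns))     # ['Unnamed: 0', 'word', 'frequency']
print(list(round_trip_without.columns))  # ['word', 'frequency']
```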

CodePudding user response:

Use these to save and read:

# Save without the index column in the first place
df.to_csv('Frequency.csv', index=False)
# When reading a file that was saved with its index, use the first column as the index
pd.read_csv("Frequency.csv", index_col=0)

Example :

import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}
df1 = pd.DataFrame(data)

df1.to_csv('calories.csv', index=False)
pd.read_csv("calories.csv",index_col=0)

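I use the combination of the two calls below because my Jupyter notebook shows an index when reading, even if I pass index=False while saving. Keep in mind that index_col=0 treats the first column of the file as the index, so after saving with index=False the calories column becomes the index; the pairing is only foolproof when the first column of the file really is the saved index.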

df1.to_csv('calories.csv', index=False)
pd.read_csv("calories.csv",index_col=0)
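
A quick way to check what index_col=0 does in each case is to run it against both kinds of file. This sketch reuses the calories data from the example and, as an assumption for convenience, writes the csv to a string instead of calories.csv:

```python
import io
import pandas as pd

df1 = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

# Saved WITH the index: index_col=0 reads the first column back as the index
saved_with_index = df1.to_csv()
restored = pd.read_csv(io.StringIO(saved_with_index), index_col=0)
print(list(restored.columns))  # ['calories', 'duration']

# Saved WITHOUT the index: index_col=0 promotes "calories" to the index
saved_without_index = df1.to_csv(index=False)
promoted = pd.read_csv(io.StringIO(saved_without_index), index_col=0)
print(promoted.index.name)     # calories
print(list(promoted.columns))  # ['duration']
```

So the read call should match how the file was written: index_col=0 for a file saved with its index, and a plain read_csv for a file saved with index=False.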