Why would an extra column (Unnamed: 0) appear after saving the df and then reading it through pd.rea

My code to save the df is:

fdi_out_vdem.to_csv("fdi_out_vdem.csv")

My code to read the df back into Python is:

fdi_out_vdem = pd.read_csv("C:/Users/asus/Desktop/classen/fdi_out_vdem.csv")

The df:

Unnamed: 0	country_name	value
1	Spain	190
2	Spain	311

CodePudding user response：

Your df has two columns, plus an index with the values 0 and 1. By default, to_csv writes that index as the first column with an empty header, so the file looks like this:

,country_name,value
0,Spain,190
1,Spain,311

When you read the file back with pandas, it is parsed as a df with 3 columns. The first column has no name in the header, so pandas labels it Unnamed: 0.
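To see this without touching the disk, here is a minimal sketch that writes to an in-memory string instead of a file. The two-row frame is a made-up stand-in for the one in the question, and the variable names are only illustrative.

```python
import io

import pandas as pd

# Hypothetical stand-in for the df in the question.
df = pd.DataFrame({"country_name": ["Spain", "Spain"], "value": [190, 311]})

# With no path given, to_csv returns the CSV text; the index is written by default.
csv_text = df.to_csv()
first_line = csv_text.splitlines()[0]  # header row, starts with an empty field

# Reading the text back turns the unnamed index column into a regular column.
roundtrip = pd.read_csv(io.StringIO(csv_text))
columns = list(roundtrip.columns)

print(first_line)
print(columns)
```

The header row starts with a comma because the index has no name, and that empty header field is what read_csv turns into Unnamed: 0.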

You have two options here. Either save it without the index column:

df.to_csv("fdi_out_vdem.csv", index=False)

df = pd.read_csv("C:/Users/asus/Desktop/classen/fdi_out_vdem.csv")

Or save it with the index column and tell pd.read_csv which column holds the index:

df.to_csv("fdi_out_vdem.csv")

df = pd.read_csv("C:/Users/asus/Desktop/classen/fdi_out_vdem.csv", index_col=[0])
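As a quick check, the two options can be compared side by side. This is only a sketch on a made-up two-row frame, using an in-memory buffer in place of the file path; index_col=0 is the scalar form of the index_col=[0] used above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the df in the question.
df = pd.DataFrame({"country_name": ["Spain", "Spain"], "value": [190, 311]})

# Option 1: do not write the index at all.
no_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

# Option 2: write the index, then tell read_csv that column 0 holds it.
with_index = pd.read_csv(io.StringIO(df.to_csv()), index_col=0)

print(list(no_index.columns))
print(list(with_index.columns))
```

In both cases only country_name and value come back as columns. The second option is the one to pick if the index carries information you want to keep.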

UPDATE
As recommended by @ouroboros1 in the comments, you could also name your index before saving it to csv, so that you can select the index column by that name when reading:

df.index.name = "index"
df.to_csv("fdi_out_vdem.csv")

df = pd.read_csv("C:/Users/asus/Desktop/classen/fdi_out_vdem.csv", index_col="index")
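A short sketch of what the named index changes, again on a made-up frame and an in-memory buffer rather than the real file:

```python
import io

import pandas as pd

# Hypothetical stand-in for the df in the question.
df = pd.DataFrame({"country_name": ["Spain", "Spain"], "value": [190, 311]})
df.index.name = "index"

# The index name now fills the first header field instead of leaving it empty.
csv_text = df.to_csv()
header = csv_text.splitlines()[0]

# The index column can then be selected by name rather than by position.
restored = pd.read_csv(io.StringIO(csv_text), index_col="index")

print(header)
print(restored.index.name)
```

Selecting by name keeps working if the column order in the file ever changes, which a positional index_col would not.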

CodePudding user response：

You can either pass the parameter index_col=[0] to pandas.read_csv:

fdi_out_vdem = pd.read_csv("C:/Users/asus/Desktop/classen/fdi_out_vdem.csv", index_col=[0])

Or even better, avoid writing the index in the first place when calling pandas.DataFrame.to_csv:

fdi_out_vdem.to_csv("fdi_out_vdem.csv", index=False)