Reassigning Pandas Series from float to int not working

I have a very simple issue...

I am working with a CSV file. For some reason, when I open it, one of the columns comes out as a float, which it is not in the original file. It also gives me 500 NaN rows, which is also inconsistent with the CSV file. I drop the NAs, convert to int, and it all seems good, until I reassign it back and it goes back to float. First time for me. (Well, I have a lot of first times, but...)

Thanks in advance!

Cheers!

df['ID'] #returns a float.

Returns -

df['ID'].dropna().astype(int)

Returns -

df['ID'] = df['ID'].dropna().astype(int)

Returns -

CodePudding user response：

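You are assigning a Series back to a column, and that assignment aligns on the index, so the rows you dropped come back as NaN. NaN in pandas is a float, which turns the whole column back into float. Drop the rows from the DataFrame itself first, then convert: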

df.dropna(subset=["ID"], inplace=True)
df["ID"] = df["ID"].astype(int)

print(df)

    ID
0   1
1   2
2   3
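To make the index alignment visible, here is a minimal sketch on made-up data (the three-row frame and the names `converted` and `dtype_after_reassign` are assumptions for illustration, not the asker's CSV). It first reproduces the float result from the question, then applies the two lines from this answer.

```python
import numpy as np
import pandas as pd

# Toy data (assumed): the NaN forces the column to float64 on load.
df = pd.DataFrame({"ID": [1.0, np.nan, 3.0]})

# The converted Series only has index labels 0 and 2.
converted = df["ID"].dropna().astype(int)

# Column assignment aligns on the index: label 1 has no value in
# `converted`, so it is filled with NaN and the column is float again.
df["ID"] = converted
dtype_after_reassign = df["ID"].dtype
print(dtype_after_reassign)  # float64

# Dropping the rows from the DataFrame itself leaves no gap to fill.
df.dropna(subset=["ID"], inplace=True)
df["ID"] = df["ID"].astype(int)
print(df["ID"].dtype)  # an integer dtype
```

The key difference is that the second version removes the rows from `df`, so the column and the converted Series share the same index.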

CodePudding user response：

Try assigning to a temporary variable first, then replacing the column in the original df.

df_temp = df['id'].dropna().astype(int)
df['id'] = df_temp
print(df)
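It may be worth checking the resulting dtype before relying on this. A minimal sketch on made-up data (the three-row frame is an assumption) suggests that going through a temporary variable behaves like the direct assignment, because the Series is still aligned on the index of `df`:

```python
import numpy as np
import pandas as pd

# Toy data (assumed): one missing value in a column named 'id'.
df = pd.DataFrame({"id": [10.0, np.nan, 30.0]})

# Integer Series with index labels 0 and 2 only.
df_temp = df["id"].dropna().astype(int)

# Aligned on the index: label 1 is missing from df_temp and becomes NaN.
df["id"] = df_temp

print(df_temp.dtype)   # an integer dtype
print(df["id"].dtype)  # float64, because the column holds NaN again
```

So the temporary Series is an integer Series, but the column stays float as long as the NaN rows are still present in `df`.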

CodePudding user response：

I think that dropna() returns a DataFrame, not a column, so

df = df['ID'].dropna().astype(int).reset_index(drop=True)

should solve the problem.
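A quick way to see what that chain produces, sketched on made-up data (the frame and the name `ids` are assumptions). The result is kept in a separate variable here, since binding it to `df` would replace the whole DataFrame with that single column:

```python
import numpy as np
import pandas as pd

# Toy data (assumed for illustration).
df = pd.DataFrame({"ID": [1.0, np.nan, 3.0]})

# dropna() on a single column returns a Series; reset_index(drop=True)
# renumbers the remaining rows from 0 and discards the old index labels.
ids = df["ID"].dropna().astype(int).reset_index(drop=True)

print(type(ids).__name__)  # Series
print(ids.tolist())        # [1, 3]
print(ids.index.tolist())  # [0, 1]
```

This gives a clean integer Series, but any other columns of `df` are not part of it.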