To delete lines in a CSV file
that have empty cells - I use the following code:
import pandas as pd
data = pd.read_csv("./test_1.csv", sep=";")
data.dropna()
data.dropna().to_csv("./test_2.csv", index=False, sep=";")
everything works fine, but I get a new CSV file
with incorrect data
:
what is highlighted in red squares
I get additional signs in the form of a dot
and a zero
.0
.
Could you please tell me how do I get correct data without .0
Thank you very much!
CodePudding user response:
Pandas represents numeric NAs
as NaNs
and therefore casts all of your ints
as floats
(python int
doesn't have a NaN
value, but float
does).
If you are sure that you removed all NAs
, just cast your columns/dfs to int
:
data = data.astype(int)
If you want to have integers and NAs, use pandas nullable integer types such as pd.Int64Dtype()
.
more on nullable integer types: https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html