Important initial information: these values are IDs, not calculation results, so I have no way to change how they are saved in the file.
Dataframe example:
datetime | match_name | match_id | runner_name | runner_id | ... |
---|---|---|---|---|---|
2022/01/01 10:10 | City v Real Madrid | 1.199632310 | City | 122.23450 | ... |
2021/01/01 01:01 | Celtic v Rangers | 1.23410 | Rangers | 101.870 | ... |
But in the DataFrame, match_id appears as:
1.19963231
1.2341
And runner_id appears as:
122.2345
101.87
I tried to convert all values to strings so the numbers would be treated as text and the zeros would not be removed:
df = pd.read_csv(filial)
df = df.astype(str)
But it didn't help: the trailing zeros were still removed.
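For reference, a minimal, self-contained reproduction of what I'm seeing (the in-memory CSV below is just a stand-in for my real file, which has many more columns):
import io
import pandas as pd

# Stand-in for the real file
csv_data = io.StringIO(
    "match_id,runner_id\n"
    "1.199632310,122.23450\n"
    "1.23410,101.870\n"
)

df = pd.read_csv(csv_data)       # IDs are parsed as float64 here...
df = df.astype(str)              # ...so the trailing zeros are already gone

print(df["match_id"].tolist())   # ['1.19963231', '1.2341']
print(df["runner_id"].tolist())  # ['122.2345', '101.87']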
I am aware of the existence of float_format, but it requires specifying the number of decimal places to use, so I could not use it: since these are IDs, I cannot take the risk of a very long value being rounded.
Note: there are hundreds of different columns.
CodePudding user response:
By the time your data is read, the zeros are already removed, so your conversion to str can no longer help. You need to pass the option directly to read_csv():
df = pd.read_csv(filial, dtype={'runner_id': str})
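For example, to keep both ID columns from your question as text in one call (column names taken from your example table):
df = pd.read_csv(filial, dtype={'match_id': str, 'runner_id': str})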
If you have many columns like this, you can set dtype=str (instead of a dictionary), but then all your columns will be str, so you need to re-parse each of the interesting ones as its correct dtype (e.g. datetime).
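A minimal sketch of that approach; the datetime format below is an assumption based on the values shown in your example table:
# Read everything as text so no trailing zeros are lost
df = pd.read_csv(filial, dtype=str)

# Re-parse the columns that should not stay as strings
# (format assumed from the '2022/01/01 10:10' values in the example)
df['datetime'] = pd.to_datetime(df['datetime'], format='%Y/%m/%d %H:%M')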
More details in the docs; you may also want to play with the converters param.
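A short sketch of the converters alternative: each listed column's raw cell text is passed through the given callable at parse time, and str simply keeps it unchanged:
df = pd.read_csv(filial, converters={'match_id': str, 'runner_id': str})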