Pandas saving and loading to CSV without introducing rounding error

Time:09-29

This script:

import numpy as np
import pandas as pd

# Write a single float to CSV, read it back, and compare the two values.

x = 10000 * np.pi

df = pd.DataFrame({"test": [x]})

df.to_csv("pd_test.csv")

other_df = pd.read_csv("pd_test.csv")

print(df["test"][0], other_df["test"][0])
print(df["test"][0] - other_df["test"][0])

Gives:

31415.926535897932 31415.92653589793
3.637978807091713e-12

I would like to not introduce a change when saving and loading to CSV, if possible - for example, is there a datatype I can use for the dataframe which would accomplish this?

I don't mind losing a small amount of accuracy if necessary, I would just like to avoid the change during the save and load process if possible.

CodePudding user response:

There are two alternatives. You can round your float with round()

x = 10000 * np.pi
print(round(x,2))
output = 31415.93

or use .format()

print("{:.2e}".format(x))
output = 3.14e+04
print("{:.2f}".format(x))
output = 31415.93
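
If losing some accuracy is acceptable, the same rounding can be applied at save time instead of on each value by hand. The sketch below assumes the `float_format` argument of `to_csv` and writes to an in-memory buffer rather than `pd_test.csv`; the two-decimal format is only an illustration. Note that this changes the stored value on purpose, so the loaded number will differ from the original `x` by the rounding amount.

```python
import io

import numpy as np
import pandas as pd

x = 10000 * np.pi
df = pd.DataFrame({"test": [x]})

# Write the frame with every float formatted to two decimal places.
buffer = io.StringIO()
df.to_csv(buffer, index=False, float_format="%.2f")

csv_text = buffer.getvalue()
print(csv_text)
```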

CodePudding user response:

I ended up casting the dataframes to float32 before saving and again after loading:

import numpy as np
import pandas as pd

# Same round trip as in the question, but with the data cast to float32.

x = 10000 * np.pi

df = pd.DataFrame({"test": [x]})

df = df.astype('float32')

df.to_csv("pd_test.csv")

other_df = pd.read_csv("pd_test.csv").astype('float32')

print(df["test"][0], other_df["test"][0])
print(df["test"][0] - other_df["test"][0])
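
An alternative that keeps float64 is to ask `read_csv` for its round-trip float parser. This is a minimal sketch, not part of the answers above: it assumes the `float_precision="round_trip"` option of `read_csv` (C parser engine) and uses an in-memory buffer in place of `pd_test.csv`. The idea is that `to_csv` already writes enough digits to identify the float exactly, and the difference in the question comes from the faster default parser used on load.

```python
import io

import numpy as np
import pandas as pd

x = 10000 * np.pi
df = pd.DataFrame({"test": [x]})

# Save to an in-memory CSV (stand-in for a file on disk).
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

# Parse floats with the slower but exact round-trip converter.
other_df = pd.read_csv(buffer, float_precision="round_trip")

diff = df["test"][0] - other_df["test"][0]
print(df["test"][0], other_df["test"][0])
print(diff)
```

The round-trip converter is slower than the default one, so it is worth checking load times on large files before adopting it everywhere.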