I would like read_json in pandas to read a JSON file back with the same column dtypes it was written with by to_json. In the example below, the 'Dec' column has dtype float64 when the dataframe is written with to_json, and the JSON file shows the numbers as floats (2.0, 3.0, etc.). However, when the file is read back with read_json, the 'Dec' column ends up as int64. I would like it to remain float64 even when the values happen to all be integer-like. This uses orient='split', if that matters. Is there a way to accomplish this? I'm looking for a general approach that doesn't depend on a specific column name, since in practice I expect this to work on many different dataframes.
import pandas as pd

tmp_file = 'c:/Temp/in_df.json'
in_df = pd.DataFrame([['A', 2.0, 4], ['B', 3.0, 2], ['C', 4.0, 3]], columns=['Key', 'Dec', 'Num'])
dec_column_type_in = in_df['Dec'].dtype  # float64
in_df.to_json(path_or_buf=tmp_file, orient='split', index=False)  # returns None when a path is given
out_df = pd.read_json(tmp_file, orient='split')
dec_column_type_out = out_df['Dec'].dtype  # int64
CodePudding user response:
You can turn off dtype inference by passing dtype=False:
out_df = pd.read_json(tmp_file, orient='split', dtype=False)
print(out_df.dtypes)
Key object
Dec float64
Num int64
dtype: object
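A minimal self-contained round trip shows the effect. This sketch uses an in-memory io.StringIO buffer in place of the temp file so it runs anywhere; with dtype=False, the floats written by to_json are kept as floats on the way back in.

```python
import io
import pandas as pd

# Sample frame: 'Dec' is float64 even though its values are integer-like.
in_df = pd.DataFrame(
    [['A', 2.0, 4], ['B', 3.0, 2], ['C', 4.0, 3]],
    columns=['Key', 'Dec', 'Num'],
)

# Round-trip through JSON in memory (StringIO stands in for the temp file).
buf = io.StringIO(in_df.to_json(orient='split', index=False))

# dtype=False disables dtype inference, so 'Dec' stays float64.
out_df = pd.read_json(buf, orient='split', dtype=False)

print(out_df.dtypes)
```

Because this takes no column names, it works unchanged for any dataframe whose numeric columns were serialized with their decimal points intact.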