I have a pandas DataFrame with 2 columns that looks like this:
import pandas as pd
data = [[30222, 5], [10211, 2], [30333, 3]]
df = pd.DataFrame(data, columns=['id', 'weight'])
and I would like to write it to a JSON file in the following form:
[
{"id":30222,"weight":5},
{"id":10211,"weight":2},
{"id":30333,"weight":3}
]
Later, I would like to store it in HDFS (so I create a folder with the JSON file in it, which I will upload afterwards).
What I tried is:
my_dict = my_df.set_index('id')['weight'].to_dict()
with TemporaryDirectory() as tempdir:
    temp_local_path = os.path.join(tempdir, "my_json")
    if boosting_nha_properties_dict is not None:
        with open(os.path.join(temp_local_path, "my_json.json"), "w") as f:
            json.dump(my_dict, f)
But the output is not in the format I need, and it also throws an exception:
"FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp36sdlarw/my_json/my_json.json'"
Thanks!
CodePudding user response:
You can try to_dict with orient='records', which gives a list with one dict per row:
d = df.to_dict(orient = 'records')
Out[210]:
[{'id': 30222, 'weight': 5},
{'id': 10211, 'weight': 2},
{'id': 30333, 'weight': 3}]
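To get from that list of records to the JSON text in the question, something along these lines should work. This is only a sketch: the variable names `records` and `text` are made up for illustration, and `default=int` is a precaution I am assuming is useful because some pandas versions can leave NumPy integer types in the records, which the json module cannot serialize on its own.

```python
import json

import pandas as pd

data = [[30222, 5], [10211, 2], [30333, 3]]
df = pd.DataFrame(data, columns=["id", "weight"])

# One dict per row, keyed by column name.
records = df.to_dict(orient="records")

# default=int is only called for values json cannot serialize itself
# (for example NumPy integers); plain Python ints pass through untouched.
text = json.dumps(records, default=int)
print(text)
```

The same `default=int` argument can be passed to json.dump when writing to an open file instead of a string.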
CodePudding user response:
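There is no need to build a dictionary: you can export directly using to_json. You do need to create the "my_json" directory first, though. open() in "w" mode creates the file but not its missing parent folders, which is why you got the FileNotFoundError.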
with TemporaryDirectory() as tempdir:
    temp_local_path = os.path.join(tempdir, "my_json")
    os.mkdir(temp_local_path)
    if boosting_nha_properties_dict is not None:
        df.to_json(os.path.join(temp_local_path, "my_json.json"), orient="records")
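For reference, here is a self-contained version that can be run as is. It is a sketch under a few assumptions: the `boosting_nha_properties_dict` check is left out because that variable is not defined anywhere in the question, `json_path` and `loaded` are names introduced here for illustration, and the file is read back only to confirm its contents. The HDFS upload is left as a comment, since the question does not say which client is used.

```python
import json
import os
from tempfile import TemporaryDirectory

import pandas as pd

data = [[30222, 5], [10211, 2], [30333, 3]]
df = pd.DataFrame(data, columns=["id", "weight"])

with TemporaryDirectory() as tempdir:
    temp_local_path = os.path.join(tempdir, "my_json")
    # Create the sub-folder before writing into it.
    os.makedirs(temp_local_path, exist_ok=True)

    json_path = os.path.join(temp_local_path, "my_json.json")
    # orient="records" writes a JSON array with one object per row.
    df.to_json(json_path, orient="records")

    # Read the file back to check what was written.
    with open(json_path) as f:
        loaded = json.load(f)

    # Upload temp_local_path to HDFS here, while the temporary
    # directory still exists; it is deleted when the block exits.

print(loaded)
```

Note that anything that needs the file on disk has to happen inside the `with` block, because TemporaryDirectory removes the folder and its contents on exit.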