I have dataframe
that has "input" as the index of all of the rows. Thousands of them.
df1 =
index item name item quantity
input apple 4
input orange 3
input lemon 6
I need to turn it to a jsonl
file looking like this. I need it looking like this because of the requirement in Shopify GraphQL Admin API Bulk import.
{ "input": { "item name": "apple", "item quantity": "4"}}
{ "input": { "item name": "orange", "item quantity": "3"}}
{ "input": { "item name": "lemon", "item quantity": "6"}}
I can use df1.to_json(orient='records', lines=True)
to convert it into jsonl
without the index to look like this
{ "item name": "apple", "item quantity": "4"}}
{ "item name": "orange", "item quantity": "3"}}
{ "item name": "lemon", "item quantity": "6"}}
But i will need to add the key "input" to the front which i dont know how to do it. I am new to json.
I also tried df1.to_json(orient="index")
but it gives me an error ValueError: DataFrame index must be unique for orient='index'
indicating that every index must be unique.
Any help is appreciated. Cheers
CodePudding user response:
Your required output is:
{ "input": { "item name": "apple", "item quantity": "4"}}
{ "input": { "item name": "orange", "item quantity": "3"}}
{ "input": { "item name": "lemon", "item quantity": "6"}}
orient='index' won't work in this case because index are not unique. The above answer looks great. Your expected output is seperate JSONs.
This is a workaround, According to your required format incase if you need the same format itself
import pandas as pd
import json
data=[['apple',4],['orange','3'],['lemon',6]]
df=pd.DataFrame(data,columns=['item name','item quantity'])
json_as_str=df.to_json(orient="index")
json_value=json.loads(json_as_str)
string_formatted=[]
for key,val in json_value.items():
string_formatted.append("{'input':%s}" %val)
for i in string_formatted:
i=i.replace("'",'"')
print(i)
Output:
{"input":{"item name": "apple", "item quantity": 4}}
{"input":{"item name": "orange", "item quantity": 3}}
{"input":{"item name": "lemon", "item quantity": 6}}
CodePudding user response:
Assuming your dataframe to be like this:
In [93]: df
Out[93]:
index item_name item_quantity
0 input1 apple 4
1 input2 orange 3
2 input3 lemon 6
You can use df.set_index
with df.T
and to_dict
:
In [95]: df.set_index('index').T.to_dict()
Out[95]:
{'input1': {'item_name': 'apple', 'item_quantity': 4},
'input2': {'item_name': 'orange', 'item_quantity': 3},
'input3': {'item_name': 'lemon', 'item_quantity': 6}}
Or, if you want json
, do this:
In [96]: df.set_index('index').T.to_json()
Out[96]: '{"input1":{"item_name":"apple","item_quantity":4},"input2":{"item_name":"orange","item_quantity":3},"input3":{"item_name":"lemon","item_quantity":6}}'