I'm aware there are other threads on this topic, but I'm running into an issue that no other answer seems to address.
Given the following Pandas dataframe:
a b c d
a1 b1 c1 d1
a2 b2 c2 d2
I would like to create a nested JSON object with the following structure:
[
{
"a": "a1",
"b": "b1",
"nested_group":
{
"c": "c1",
"d": "d1"
}
},
{
"a": "a2",
"b": "b2",
"nested_group":
{
"c": "c2",
"d": "d2"
}
}
]
The solution I've found in other threads is to use the following code:
j = (df.groupby(['a','b'])
.apply(lambda x: x[['c','d']].to_dict('records'))
.reset_index()
.rename(columns={0:'nested_group'})
.to_json(orient='records'))
However, the issue I'm running into is that the above code places each nested_group
object inside square brackets, like so:
"nested_group": [
{
"c": "c2",
"d": "d2"
}
]
The object I'm trying to generate is intended to be the payload for an API call, and unfortunately the square brackets around each inner dictionary cause the API to return an error. (The outermost brackets at the very top/bottom of the object are fine.) How can I make Python NOT treat the inner dictionaries as lists?
Code to reproduce the example dataframe:
import numpy as np
import pandas as pd
array = np.array([['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']])
df = pd.DataFrame(data=array, columns=['a','b','c','d'])
Thank you in advance :)
CodePudding user response:
Looking at the docs for to_dict
it seems like we still have to use the records
option, and if we assume it will always be a list of 1
element, just take the 0
th element using your original code
>>> import numpy as np
>>> import pandas as pd
>>> array = np.array([['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']])
>>> df = pd.DataFrame(data=array, columns=['a','b','c','d'])
>>> (df.groupby(['a','b'])
.apply(lambda x: x[['c','d']].to_dict('records')[0])
.reset_index()
.rename(columns={0:'nested_group'})
.to_json(orient='records'))
'[{"a":"a1","b":"b1","nested_group":{"c":"c1","d":"d1"}},{"a":"a2","b":"b2","nested_group":{"c":"c2","d":"d2"}}]'
CodePudding user response:
Let us try
out = [{'a':x['a'],'b':x['b'],'nested_group':x[['c','d']].to_dict()} for _,x in df.iterrows() ]
Out[284]:
[{'a': 'a1', 'b': 'b1', 'nested_group': {'c': 'c1', 'd': 'd1'}},
{'a': 'a2', 'b': 'b2', 'nested_group': {'c': 'c2', 'd': 'd2'}}]