I have the following data frame"
Code to create the dataframe:
df = pd.DataFrame( {'month_i': {0: '2022-01-31', 1: '2022-02-28', 2: '2022-03-31', 3: '2022-04-30', 4: '2022-01-31', 5: '2022-02-28', 6: '2022-03-31', 7: '2022-04-30'}, 'id': {0: 'ACT', 1: 'ACT', 2: 'ACT', 3: 'ACT', 4: 'ACT', 5: 'ACT', 6: 'ACT', 7: 'ACT'}, 'city_name': {0: 'New York', 1: 'New York', 2: 'New York', 3: 'New York', 4: 'New York', 5: 'New York', 6: 'New York', 7: 'New York'}, 'lineColor': {0: '#4F63E7', 1: '#4F63E7', 2: '#4F63E7', 3: '#4F63E7', 4: '#4F63E7', 5: '#4F63E7', 6: '#4F63E7', 7: '#4F63E7'}, 'ptype': {0: 'house', 1: 'house', 2: 'house', 3: 'house', 4: 'unit', 5: 'unit', 6: 'unit', 7: 'unit'}, 'som': {0: 171.0, 1: 483.0, 2: 478.0, 3: 465.0, 4: 107.0, 5: 250.0, 6: 268.0, 7: 248.0}, 'dom': {0: 25, 1: 30, 2: 24, 3: 24, 4: 53, 5: 51, 6: 48, 7: 37}} )
df['month_i'] = pd.to_datetime(df['month_i'])
which needs to be presented as a JSON in this format
{
"house": {
"som": [
171,
483,
478,
465
],
"dom": [
25,
30,
24,
24
]
},
"unit": {
"som": [
107,
250,
268,
248
],
"dom": [
53,
51,
48,
37
]
},
"ref": {
"months": [
"2022-01-31",
"2022-02-28",
"2022-03-31",
"2022-04-30"
],
"lineColor": "#4F63E7",
"city_name": "New York"
}
}
I was hoping to get to a solution where the json can be produced in pandas with dict or a nested loop. But can only get as far as grouping by ptype. Any help would be greatly appreciated!
P.S> lineColor can be hardcoded to "#4F63E7" and does not need to be extracted from the dataframe
CodePudding user response:
You can try creating two different dictionaries: one with the main data and the other with the reference data and then merging them:
import json
#pivot the dataframe with the main data
pivoted = df.pivot("month_i","ptype",["som","dom"]).T
#groupby and create the first dictionary in the necessary format
data = pivoted.groupby(level=1).apply(lambda x: x.droplevel(1).apply(list,axis=1).to_dict()).to_dict()
#create the second dictionary with the reference data
other = {"ref": {"months": list(pivoted.columns),
"lineColor": df["lineColor"].iat[0],
"city_name":df["city_name"].iat[0]}}
#merge the dictionaries
merged = data|other
#convert to json
output = json.dumps(merged)
>>> output
{
"house": {
"som": [
171,
483,
478,
465
],
"dom": [
25,
30,
24,
24
]
},
"unit": {
"som": [
107,
250,
268,
248
],
"dom": [
53,
51,
48,
37
]
},
"ref": {
"months": [
"2022-01-31",
"2022-02-28",
"2022-03-31",
"2022-04-30"
],
"lineColor": "#4F63E7",
"city_name": "New York"
}
}
CodePudding user response:
Another approch:
_dict = {}
m_key = []
s_key = ['som', 'dom']
for value in df['ptype'].unique():
frame = df[df['ptype'] == value]
m_key.append(value)
for m in m_key:
_dict[m] = {}
for s in s_key:
_dict[m][s] = []
for ptype in df['ptype'].unique():
frame = df[df['ptype'] == ptype]
_dict[m][s].append(frame['som'].to_list())
_dict[m][s].append(frame['dom'].to_list())
break
_dict['ref'] = {}
_dict['ref']['months'] = df['month_i'].unique().tolist()
_dict['ref']['lineColor'] = '#4F63E7'
print(json.dumps(_dict))