I am trying to convert a pandas dataframe into a nested json and am stuck. The Dataframe is containing Data in the following format:
description 69 70 project_id 60
1 Lorem Ipsum... 14 5679 CA
2 Lorem Ipsum... hom 15 2904 CA
3 Lorem Ipsum... im 14 5679 CA
What I want is a JSON that looks like this:
[ {
"issue": {
"project_id": 5679,
"description": "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum.",
"custom_fields": [
{
"id": 69,
"value": null
},
{
"id": 60,
"value": "CA"
},
{
"id": 70,
"value": "14"
}
]
}
},
{
"issue": {
"project_id": 2904,
"description": "Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet ",
"custom_fields": [
{
"id": 69,
"value": "hom"
},
{
"id": 60,
"value": "CA"
},
{
"id": 70,
"value": "15"
}
]
}
},
This is what I have so far, but it doesn´t result in the needed format, what I don´t get for example are the headers of the JSON ("issue"...) and the nesting ("id": ..., "value": ...)
j = (df_test.groupby(['description','project_id'])
.apply(lambda x: x[['69', '70', '84', '60']].to_dict('records'))
.reset_index()
.rename(columns={0:'custom_fields'})
.to_json(orient='records'))
Any ideas on what I have to adjust? Thanks in advance on any help and tips
CodePudding user response:
Instead of putting it to one chained function call I would propose to simply loop over the data frame and process the information as needed:
import pandas as pd
import json
# create example data
df = pd.DataFrame({"description" : ['Lorem Ipsum', 'Lorem Ipsum', 'Lorem Ipsum'],
69: [None, "hom", "im"],
70: [14, 15, 14],
"project_id" : [5679, 2904, 5679],
60: ["CA", "CA", "CA"]
})
# convert data to desired format
res = []
custom_field_keys = [69, 60, 70]
for idx, row in df.iterrows():
entry = { key: value for key, value in row.items() if key not in custom_field_keys }
entry["custom_fields"] = [
{"id": key, "value": value} for key, value in row.items() if key in custom_field_keys
]
res.append({"issue": entry})
print(json.dumps(res, indent=2))
this yields:
[
{
"issue": {
"description": "Lorem Ipsum",
"project_id": 5679,
"custom_fields": [
{
"id": 69,
"value": null
},
{
"id": 70,
"value": 14
},
{
"id": 60,
"value": "CA"
}
]
}
},
{
"issue": {
"description": "Lorem Ipsum",
"project_id": 2904,
"custom_fields": [
{
"id": 69,
"value": "hom"
},
{
"id": 70,
"value": 15
},
{
"id": 60,
"value": "CA"
}
]
}
},
{
"issue": {
"description": "Lorem Ipsum",
"project_id": 5679,
"custom_fields": [
{
"id": 69,
"value": "im"
},
{
"id": 70,
"value": 14
},
{
"id": 60,
"value": "CA"
}
]
}
}
]