I am running Python 3.8 and Pandas 0.19.2 and have a DataFrame which looks as follows:
id_number | name | amount | addenda |
---|---|---|---|
1234 | ABCD | $100 | Car-wash-$30 |
1234 | ABCD | $100 | Maintenance-$70 |
I need a dictionary/JSON like below:
[
{
'id_number': 1234,
'name': 'ABCD',
'amount': '$100',
'addenda': [
{'payment_related_info': 'Car-wash-$30'},
{'payment_related_info': 'Maintenance-$70'}
]
}
]
I tried using groupby and to_dict, but it did not work. Any suggestions? Thanks up front for the help.
CodePudding user response:
Just apply a groupby and aggregate by creating a dataframe inside like this:
data = {
"id_number": [1234, 1234],
"name": ["ABCD", "ABCD"],
"amount": ["$100", "$100"],
"addenda": ["Car-wash-$30", "Maintenance-$70"]
}
df = pd.DataFrame(data=data)
df.groupby(by=["id_number", "name", "amount"]) \
.agg(lambda col: pd.DataFrame(data=col) \
.rename(columns={"addenda": "payment_related_info"})) \
.reset_index() \
.to_json(orient="records")
This returns axactly the result you want!
CodePudding user response:
If we work backwards, you'll need your DataFrame to have the addenda information in a single row before using .to_dict
operation:
id_number | name | amount | addenda |
---|---|---|---|
1234 | ABCD | $100 | [{payment_related_info: Car-wash-$30, payment_related_info: Maintenance-$70}] |
To get here, you can perform a groupby
on id_number, name, amount
, then apply a function that collapses the strings in the addenda
rows for that groupby into a list of dictionaries where each key is the string 'payment_related_info'
.
This works as expected if you add more rows to your original df
as well:
id_number | name | amount | addenda |
---|---|---|---|
1234 | ABCD | $100 | Car-wash-$30 |
1234 | ABCD | $100 | Maintenance-$70 |
2345 | BCDE | $200 | Car-wash-$100 |
2345 | BCDE | $200 | Maintenance-$100 |
def collapse_row(x):
addenda_list = x["addenda"].to_list()
last_row = x.iloc[-1]
last_row["addenda"] = [{'payment_related_info':v} for v in addenda_list]
return last_row
grouped = df.groupby(["id_number","name","amount"]).apply(collapse_row).reset_index(drop=True)
grouped.to_dict(orient='records')
Result:
[
{
"id_number":1234,
"name":"ABCD",
"amount":"$100",
"addenda":[
{"payment_related_info":"Car-wash-$30"},
{"payment_related_info":"Maintenance-$70"}
]
},
{
"id_number":2345,
"name":"BCDE",
"amount":"$200",
"addenda":[
{"payment_related_info":"Car-wash-$100"},
{"payment_related_info":"Maintenance-$100"}
]
}
]