I would like some insight into getting the output below from its corresponding input. I tried some code but couldn't get the result I wanted. I would like to see the table converted to the desired format, as I will have to work with a huge CSV at a later stage. Any input is highly appreciated.
Input:
reference | mcc | value | currency |
---|---|---|---|
abcd1234 | 5300 | 134.09 | USD |
abcd1235 | 5411 | 38.48 | USD |
Code used:
from csv import DictReader
from itertools import groupby
from pprint import pprint
import json
with open('Test_bulk_transactions_data.csv') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]
    group = []
    uniquekeys = []
    for k, g in groupby(data, lambda r: (r['reference'], r['mcc'])):
        group.append({
            "reference": k[0],
            "mcc": k[1],
            "amount": [{k:v for k, v in d.items() if k not in ['reference','mcc']} for d in list(g)]})
        uniquekeys.append(k)
print(json.dumps(group, indent = 3) + '}')
Current Output:
[
   {
      "reference": "abcd1234",
      "mcc": "5300",
      "amount": [
         {
            "value": "134.09",
            "currency": "USD"
         }
      ]
   },
   {
      "reference": "abcd1235",
      "mcc": "5411",
      "amount": [
         {
            "value": "38.48",
            "currency": "USD"
         }
      ]
   }
]}
Desired Output:
{
   "cardTransactions": [
      {
         "reference": "abcd1234",
         "mcc": "5300",
         "amount": {
            "value": 134.09,
            "currency": "USD"
         }
      },
      {
         "reference": "abcd1235",
         "mcc": "5411",
         "amount": {
            "value": 38.48,
            "currency": "USD"
         }
      }
   ]
}
CodePudding user response:
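Looks like you just need to append everything under a "cardTransactions" key, and cast the value to float when each entry is created. Only the value field should be cast, since float('USD') would raise a ValueError:
"amount": [{k: (float(v) if k == 'value' else v) for k, v in d.items() if k not in ['reference','mcc']} for d in list(g)]})
Then change group = [] to group = defaultdict(list) (this needs from collections import defaultdict)
and call group['cardTransactions'].append(... code as usual ...)
Since group is now a dict, the extra '}' in the print call is no longer needed. Note that "amount" is still a one-element list here, not the single object shown in the desired output.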
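Put together, the changes above might look like the sketch below. It is only an illustration under a few assumptions: the CSV is comma-separated with lowercase headers, the file is replaced by an in-memory sample (SAMPLE) holding the two rows from the question, and group_transactions is a hypothetical helper name, not something from the original code.

```python
import csv
import io
import json
from collections import defaultdict
from itertools import groupby

# Hypothetical stand-in for Test_bulk_transactions_data.csv (assumed comma-separated).
SAMPLE = """reference,mcc,value,currency
abcd1234,5300,134.09,USD
abcd1235,5411,38.48,USD
"""


def group_transactions(csvfile):
    """Group adjacent rows by (reference, mcc) under a 'cardTransactions' key."""
    rows = csv.DictReader(csvfile, skipinitialspace=True)
    group = defaultdict(list)
    # groupby only merges *adjacent* rows with equal keys, as in the question's code.
    for (reference, mcc), g in groupby(rows, lambda r: (r["reference"], r["mcc"])):
        group["cardTransactions"].append({
            "reference": reference,
            "mcc": mcc,
            # Cast only the 'value' field; 'currency' stays a string.
            "amount": [
                {k: (float(v) if k == "value" else v)
                 for k, v in d.items() if k not in ("reference", "mcc")}
                for d in g
            ],
        })
    return dict(group)


result = group_transactions(io.StringIO(SAMPLE))
print(json.dumps(result, indent=3))
```

With this approach "amount" remains a list per transaction, so it only matches the desired output if a single object is taken from each list afterwards.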
CodePudding user response:
Your desired output does not allow for multiple amount/currency pairs in a given transaction, so you don't need groupby at all.
The process could be as concise as this (run it inside the with block, while the reader r is still open; 'amount' is listed before **d so that value and currency are popped out of d first):
data = { 'cardTransactions':
         [{ 'amount': {'value'   : float(d.pop('value')),
                       'currency': d.pop('currency')},
            **d }
          for d in r ]}
print(json.dumps(data, indent = 3))
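For reference, a self-contained variant of the no-groupby idea could look like the sketch below. It makes some assumptions: the CSV is comma-separated with lowercase headers, an in-memory sample (SAMPLE) stands in for the real file, and to_card_transactions is a hypothetical helper name. The keys are written out explicitly so that they appear in the same order as in the desired output.

```python
import csv
import io
import json

# Hypothetical stand-in for Test_bulk_transactions_data.csv (assumed comma-separated).
SAMPLE = """reference,mcc,value,currency
abcd1234,5300,134.09,USD
abcd1235,5411,38.48,USD
"""


def to_card_transactions(csvfile):
    """Build the desired structure: one nested 'amount' object per CSV row."""
    reader = csv.DictReader(csvfile, skipinitialspace=True)
    return {
        "cardTransactions": [
            {
                "reference": row["reference"],
                "mcc": row["mcc"],
                "amount": {
                    "value": float(row["value"]),  # numeric in the desired output
                    "currency": row["currency"],
                },
            }
            for row in reader
        ]
    }


data = to_card_transactions(io.StringIO(SAMPLE))
print(json.dumps(data, indent=3))
```

With a real file, the call would be made inside a with open(...) block in place of io.StringIO(SAMPLE). The whole list is still built in memory before being serialized, which is worth keeping in mind for a very large CSV.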