Home > database >  Retrieve JSON from CSV using python in desired format
Retrieve JSON from CSV using python in desired format

Time:12-24

Many thanks in advance, I have been keen on collecting some insights on getting the below output from its corresponding input. Would like to see the table getting converted to a desirable format via using a python script as I have to work with a huge CSV at a later stage. Any inputs are highly appreciated.

Input CSV:

reference mcc value currency
10000 5300 134.09 USD
10001 5651 128.95 USD
10002 5912 104.71 USD

Used python code:

from csv import DictReader
from itertools import groupby
from pprint import pprint
import json

with open('Test_bulk_transactions_data.csv') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]

    group = []
    uniquekeys = []

    for k, g in groupby(data, lambda r: (r['reference'], r['mcc'])):
        group.append({
            "reference": k[0],
            "mcc": k[1],
            "amount": [{k:v for k, v in d.items() if k not in ['reference','mcc']} for d in list(g)]})
        uniquekeys.append(k)

print(json.dumps(group, indent = 3)   '}')

Current Output:

  {
   "reference": "10000",
   "mcc": "5300",
   "amount": [
    {
     "value": "134.09",
     "currency": "USD"
    }
   ]
  },
  {
   "reference": "10001",
   "mcc": "5651",
   "amount": [
    {
     "value": "128.95",
     "currency": "USD"
    }
   ]
  },
  {
   "reference": "10002",
   "mcc": "5912",
   "amount": [ 
    {
     "value": "104.71",
     "currency": "USD"
    }
   ]
  }

Desired Output JSON:

  {
   "reference": "10000",
   "mcc": "5300",
   "amount": {
     "value": 134.09,
     "currency": "USD"
    }
  },
  {
   "reference": "10001",
   "mcc": "5651",
   "amount": {
     "value": 128.95,
     "currency": "USD"
    }
  },
  {
   "reference": "10002",
   "mcc": "5912",
   "amount": {
     "value": 104.71,
     "currency": "USD"
    }
  }
  • Important Note: Amount shouldn't fall into [] and that the value should stand as a numeric output and not strings.

CodePudding user response:

import csv

csv_filepath =  "/home/mhs/test.csv"
 
output = []


with open(csv_filepath) as cd:
    csvReader = csv.DictReader(cd)
    for r in csvReader:
        r["amount"] = {"value": float(r.pop("value")), "currency": r.pop("currency")}
        output.append(r)

CodePudding user response:

Without relying on imported modules, you could just do this:

J = []

with open('input.csv') as csv:
    cols = next(csv).split()
    assert len(cols) == 4
    for row in csv:
        t = row.split()
        if len(t) == 4:
            J.append({cols[0]: t[0], cols[1]: t[1], "amount": {cols[2]: float(t[2]), cols[3]: t[3]}})


print(J)
  • Related