Home > Mobile >  How to transform a JSON file that has 1 'header' row followed by 'data' rows? -
How to transform a JSON file that has 1 'header' row followed by 'data' rows? -

Time:09-23

I have a JSON file, like this:

{
   "data" : [
     { "values" : [ "ColumnHeader1", "ColumnHeader2", "ColumnHeader3" ]},
     { "values" : [ "Row1Column1", "Row1Column2", "Row1Column3" ]},
     { "values" : [ "Row2Column1", "Row2Column2", "Row2Column3" ]}
  ]
}

I want to transform it, to be like this:

{
   data: [
     { "ColumnHeader1" : "Row1Value1", "ColumnHeader2": "Row1Value2", "ColumnHeader3" : "Row1Value3" },
     { "ColumnHeader1" : "Row2Value1", "ColumnHeader2": "Row2Value2", "ColumnHeader3" : "Row2Value3" }
  ]
}

I did write a Python script for that - but I wonder could something clever be done via jq or pandas ? (or some other Unix tool or Python library...)

CodePudding user response:

A jq-only solution:

def objectify($header): 
  . as $in
  | reduce range(0; $header|length) as $i ({}; .[$header[$i]] = $in[$i] );

.data[0].values as $header
| .data |= (.[1:] | map(.values | objectify($header)) )

If you like nifty:

def objectify($header): with_entries(.key |= $header[.]) ;

So, if you want a two-liner:

.data[0].values as $header
| .data |= (.[1:] | map(.values | with_entries(.key |= $header[.])))

CodePudding user response:

(not very elegant) Answer

"""
Read a JSON file where the 1st item in array is a set of headers, and the other items are values.
Outputs a JSON file where the other items in array are transposed to use those headers.
"""
import json
from sys import argv

input_json=argv[1]
output_json=argv[2]

data = None
with open(input_json, "r") as infile:
    # returns JSON object as a dictionary
    data = json.load(infile)

headings = data["data"][0]['values']

new_rows = []
rows = len(data["data"])
for r in range(1, rows):
    row = data["data"][r]['values']
    new_row = dict()
    new_rows.append(new_row)
    for h in range(0, len(headings)):
        new_row[headings[h]] = row[h]

new_data = dict()
new_data["data"] = new_rows

with open(output_json, "w") as outfile:
    json_object = json.dumps(new_data, indent=2)
    outfile.write(json_object)

Hoping there is a better way with less code :)

CodePudding user response:

A one liner:

d = {
   "data" : [
     { "values" : [ "ColumnHeader1", "ColumnHeader2", "ColumnHeader3" ]},
     { "values" : [ "Row1Column1", "Row1Column2", "Row1Column3" ]},
     { "values" : [ "Row2Column1", "Row2Column2", "Row2Column3" ]}
  ]
}


d = {"data": [{k: v for k, v in zip(d["data"][0]["values"], row["values"])} for row in d["data"][1:]]}

Outputs:

{'data': [{'ColumnHeader1': 'Row1Column1', 'ColumnHeader2': 'Row1Column2', 'ColumnHeader3': 'Row1Column3'}, {'ColumnHeader1': 'Row2Column1', 'ColumnHeader2': 'Row2Column2', 'ColumnHeader3': 'Row2Column3'}]}

CodePudding user response:

You don't need to iterate all headings. I hope it helps.

data = {
    "data": [
        {"values": ["ColumnHeader1", "ColumnHeader2", "ColumnHeader3"]},
        {"values": ["Row1Column1", "Row1Column2", "Row1Column3"]},
        {"values": ["Row2Column1", "Row2Column2", "Row2Column3"]},
    ]
}

data = data["data"]
headings = data[0]["values"]
rows = data[1:]
new_data = {"data":[dict(zip(headings, row["values"])) for row in rows]}

CodePudding user response:

Here is a different jq solution without reduce:

.data |= (
  map(.values)
  | first as $headers | del(first)
  | map(
    [ $headers, .]
    | transpose
    | map({(first): last})
    | add
  )
)

Output:

{
  "data": [
    {
      "ColumnHeader1": "Row1Column1",
      "ColumnHeader2": "Row1Column2",
      "ColumnHeader3": "Row1Column3"
    },
    {
      "ColumnHeader1": "Row2Column1",
      "ColumnHeader2": "Row2Column2",
      "ColumnHeader3": "Row2Column3"
    }
  ]
}

Or to rebuild the result object from scratch:

{
  data: (
    .data | map(.values)
    | first as $headers | del(first) 
    | map(
      [ $headers, .]
      | transpose
      | map({(first): last})
      | add
    )
  )
}

first as $headers could be rewritten as . as [$headers] or .[0] as $headers. del(first) could be replaced with .[1:].

CodePudding user response:

Here's a solution for jq using transpose and map:

.data |= (map(.values) | transpose
  | map([{(.[0]): .[1:][]}]) | transpose
  | map(add)
)
{
  "data": [
    {
      "ColumnHeader1": "Row1Column1",
      "ColumnHeader2": "Row1Column2",
      "ColumnHeader3": "Row1Column3"
    },
    {
      "ColumnHeader1": "Row2Column1",
      "ColumnHeader2": "Row2Column2",
      "ColumnHeader3": "Row2Column3"
    }
  ]
}

Demo

  • Related