Output text file content to JSON using Python

I would like to output the following text file to the JSON format displayed below using Python.

I am not sure where to begin, but I hope that my problem is simple to understand.

input.txt

Name 1, FeatureServer, thisIsALink, layer 1, layer 2
Name 2, FeatureServer, thisIsALink, layer 1
Name 3, MapServer, thisIsALink, layer 1, layer 2
Name 4, FeatureServer, thisIsALink, layer 1, layer 2
Name 5, FeatureServer, thisIsALink, layer 1, layer 2, layer 3

output.json

{
    "Name 1": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2"
        }
    },
    "Name 2": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1"
        }
    },
    "Name 3": {
        "type": "MapServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2"
        }
    },
    "Name 4": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2"
        }
    },
    "Name 5": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2",
            "3": "layer 3"
        }
    }
}

I have tried following a tutorial from GeeksforGeeks, but I haven't quite figured out how to make it work for my situation.

CodePudding user response：

Here is a small script that does just that:

import json

output = {}
with open("input.txt", "r") as f:
    for line in f:
        # Split the line, preserve all layers
        name, server, url, *layers = line.rstrip().split(", ")
        output[name] = {
            "type": server,
            "url": url,
            "layer(s)": {str(i): layer for i, layer in enumerate(layers, 1)}
        }

with open("output.json", "w") as f:
    json.dump(output, f, indent=4)

CodePudding user response：

If we have that sample text, we can split it into lines.

text = """Name 1, FeatureServer, thisIsALink, layer 1, layer 2
Name 2, FeatureServer, thisIsALink, layer 1
Name 3, MapServer, thisIsALink, layer 1, layer 2
Name 4, FeatureServer, thisIsALink, layer 1, layer 2
Name 5, FeatureServer, thisIsALink, layer 1, layer 2, layer 3"""

lines = text.split('\n')

Now we can import the re module and use it to split each line.

import re

data = [re.split(r'\s*,\s*', line) for line in lines]
# [['Name 1', 'FeatureServer', 'thisIsALink', 'layer 1', 'layer 2'], 
#  ['Name 2', 'FeatureServer', 'thisIsALink', 'layer 1'],
#  ['Name 3', 'MapServer', 'thisIsALink', 'layer 1', 'layer 2'],
#  ['Name 4', 'FeatureServer', 'thisIsALink', 'layer 1', 'layer 2'],
#  ['Name 5', 'FeatureServer', 'thisIsALink', 'layer 1', 'layer 2', 'layer 3']]

Then a few dictionary comprehensions can build up the dictionary you're looking for.

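d = {
  x[0]: {
    'type': x[1],
    'url': x[2],
    'layer(s)': {
      # Key each layer by the number in its name ('layer 1' -> '1').
      # This assumes every layer looks like '<word> <number>'.
      z: y
      for y in x[3:]
      for _, z in (y.split(' ', 1),)
    }
  }
  for x in data
}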
# {'Name 1': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2'}}, 
#  'Name 2': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1'}},
#  'Name 3': {'type': 'MapServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2'}}, 
#  'Name 4': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2'}}, 
#  'Name 5': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2', '3': 'layer 3'}}}

Then you just need to dump that dictionary to JSON.
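As a minimal sketch of that last step: json.dumps returns the JSON text as a string, and json.dump writes it to an open file. The dictionary below is a shortened stand-in for d so the snippet runs on its own, and the output.json file name is taken from the question.

```python
import json

# Shortened stand-in for the dictionary `d` built above, so this runs on its own.
d = {
    "Name 2": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {"1": "layer 1"},
    }
}

# json.dumps returns the JSON text as a string.
json_text = json.dumps(d, indent=4)
print(json_text)

# json.dump writes the same JSON straight to an open file.
with open("output.json", "w") as f:
    json.dump(d, f, indent=4)
```

The indent=4 argument only affects readability; leaving it out produces the same data on a single line.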

CodePudding user response：

Here is one way to do it, based on the information you've provided:

import json

def main():
    file = 'input.txt'  # path to the input file from the question
    output_dict = {}
    with open(file) as f:
        for line in f:
            # Strip the newline character and split the line into a list
            line_ls = line.strip().split(", ")
            # Create a dictionary for each row:
            row_dict = {
                line_ls[0]: {
                    "type": line_ls[1],
                    "url": line_ls[2],
                    # Create a dictionary for each layer:
                    "layer(s)": {str(index): val for index, val in enumerate(line_ls[3:], start=1)}
                    }
                }
            # Update the output dictionary with the row dictionary:
            output_dict.update(row_dict)
        
    # Serialize the output dictionary to a JSON string, indented for readability:
    json_string = json.dumps(output_dict, indent=4)
    # Write the JSON string to a file:
    with open('json_data.json', 'w') as outfile:
        outfile.write(json_string)
        
if __name__ == '__main__':
    main()

Play with it, see if that is what you're looking for!

CodePudding user response：

The problem is that your layers need to be nested in the JSON, and CSV has no way to represent nesting. If it is implied that columns 4, 5 (and beyond) should be nested, you could handle that programmatically.
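One way to handle that nesting programmatically, sketched here with the standard csv module: read each row, keep the first three columns as name, type, and url, and fold everything from column 4 onward into a nested dictionary. The helper name rows_to_nested is made up for illustration, and io.StringIO stands in for an open input.txt so the snippet runs on its own.

```python
import csv
import io
import json

# Two sample rows from the question; io.StringIO stands in for an open file.
sample = io.StringIO(
    "Name 1, FeatureServer, thisIsALink, layer 1, layer 2\n"
    "Name 2, FeatureServer, thisIsALink, layer 1\n"
)

def rows_to_nested(handle):
    """Hypothetical helper: nest columns 4 and beyond under "layer(s)"."""
    result = {}
    # skipinitialspace=True drops the space that follows each comma.
    for row in csv.reader(handle, skipinitialspace=True):
        if not row:
            continue  # ignore blank lines
        name, server, url, *layers = row
        result[name] = {
            "type": server,
            "url": url,
            "layer(s)": {str(i): layer for i, layer in enumerate(layers, 1)},
        }
    return result

nested = rows_to_nested(sample)
print(json.dumps(nested, indent=4))
```

With a real file, the same helper could be called as rows_to_nested(open("input.txt", newline="")). A row with fewer than three columns would raise a ValueError in this sketch, so malformed lines would need their own handling.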

I assume this is the tutorial you are referring to:

https://www.geeksforgeeks.org/convert-csv-to-json-using-python/