Output text file content to JSON Python

Time:12-08

I would like to convert the following text file to the JSON format shown below using Python.

I am not sure where to begin, but I hope that my problem is simple to understand.

input.txt

Name 1, FeatureServer, thisIsALink, layer 1, layer 2
Name 2, FeatureServer, thisIsALink, layer 1
Name 3, MapServer, thisIsALink, layer 1, layer 2
Name 4, FeatureServer, thisIsALink, layer 1, layer 2
Name 5, FeatureServer, thisIsALink, layer 1, layer 2, layer 3

output.json

{
    "Name 1": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2"
        }
    },
    "Name 2": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1"
        }
    },
    "Name 3": {
        "type": "MapServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2"
        }
    },
    "Name 4": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2"
        }
    },
    "Name 5": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {
            "1": "layer 1",
            "2": "layer 2",
            "3": "layer 3"
        }
    }
}

I have tried following a tutorial from GeeksforGeeks, but I haven't quite figured out how to make it work for my situation.

CodePudding user response:

Here is a small script that does just that:

import json

output = {}
with open("input.txt", "r") as f:
    for line in f:
        # Split the line, preserve all layers
        name, server, url, *layers = line.rstrip().split(", ")
        output[name] = {
            "type": server,
            "url": url,
            "layer(s)": {str(i): layer for i, layer in enumerate(layers, 1)}
        }

with open("output.json", "w") as f:
    json.dump(output, f, indent=4)
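
To see what the starred assignment in that loop is doing, here is a small sketch that runs it on a single line from the sample input. The variable names `sample` and `numbered` are only for illustration.

```python
sample = "Name 5, FeatureServer, thisIsALink, layer 1, layer 2, layer 3\n"

# The first three fields go to named variables; *layers collects the rest as a list.
name, server, url, *layers = sample.rstrip().split(", ")
print(layers)

# enumerate(layers, 1) numbers the layers starting from 1; str(i) makes the keys strings,
# which matches the "1", "2", ... keys in the desired JSON.
numbered = {str(i): layer for i, layer in enumerate(layers, 1)}
print(numbered)
```

Note that a blank line in input.txt would make the unpacking fail with a ValueError, so you may want to skip empty lines if your file can contain them.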

CodePudding user response:

If we have that sample text, we can split it into lines.

text = """Name 1, FeatureServer, thisIsALink, layer 1, layer 2
Name 2, FeatureServer, thisIsALink, layer 1
Name 3, MapServer, thisIsALink, layer 1, layer 2
Name 4, FeatureServer, thisIsALink, layer 1, layer 2
Name 5, FeatureServer, thisIsALink, layer 1, layer 2, layer 3"""

lines = text.split('\n')

Now we can import the re module and use it to split each line.

import re

data = [re.split(r'\s*,\s*', line) for line in lines]
# [['Name 1', 'FeatureServer', 'thisIsALink', 'layer 1', 'layer 2'], 
#  ['Name 2', 'FeatureServer', 'thisIsALink', 'layer 1'],
#  ['Name 3', 'MapServer', 'thisIsALink', 'layer 1', 'layer 2'],
#  ['Name 4', 'FeatureServer', 'thisIsALink', 'layer 1', 'layer 2'],
#  ['Name 5', 'FeatureServer', 'thisIsALink', 'layer 1', 'layer 2', 'layer 3']]

Then a few dictionary comprehensions can build up the dictionary you're looking for.

d = {
    x[0]: {
        'type': x[1],
        'url': x[2],
        'layer(s)': {
            # Key each layer by the number in its name: 'layer 2' -> '2'
            y.split(' ', 1)[1]: y
            for y in x[3:]
        }
    }
    for x in data
}
# {'Name 1': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2'}}, 
#  'Name 2': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1'}},
#  'Name 3': {'type': 'MapServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2'}}, 
#  'Name 4': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2'}}, 
#  'Name 5': {'type': 'FeatureServer', 'url': 'thisIsALink', 'layer(s)': {'1': 'layer 1', '2': 'layer 2', '3': 'layer 3'}}}

Then you just need to dump that dictionary to JSON.
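
That last step could look like the sketch below. It uses a one-entry stand-in for `d` so the snippet runs on its own, and the file name output.json is taken from the question.

```python
import json

# A one-entry stand-in for the dictionary d built above.
d = {
    "Name 2": {
        "type": "FeatureServer",
        "url": "thisIsALink",
        "layer(s)": {"1": "layer 1"},
    }
}

# json.dumps returns the JSON text as a string.
json_text = json.dumps(d, indent=4)
print(json_text)

# json.dump writes the same JSON straight to an open file.
with open("output.json", "w") as f:
    json.dump(d, f, indent=4)
```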

CodePudding user response:

Here is one way to do it, based on the information you've provided:

import json

def main():
    file = './74724498/input.txt'
    output_dict = {}
    with open(file) as f:
        for line in f:
            # Strip the newline character and split the line into a list
            line_ls = line.strip().split(", ")
            # Create a dictionary for each row:
            row_dict = {
                line_ls[0]: {
                    "type": line_ls[1],
                    "url": line_ls[2],
                    # Create a dictionary for each layer:
                    "layer(s)": {str(index): val for index, val in enumerate(line_ls[3:], start=1)}
                    }
                }
            # Update the output dictionary with the row dictionary:
            output_dict.update(row_dict)
        
    # The input file is closed at this point.
    # Serialize the output dictionary to a JSON string, indented for readability:
    json_string = json.dumps(output_dict, indent=4)
    # Write the JSON string to a file:
    with open('json_data.json', 'w') as outfile:
        outfile.write(json_string)
        
if __name__ == '__main__':
    main()

Play with it, see if that is what you're looking for!

CodePudding user response:

The problem is that your "layers" need to be nested in JSON, and a flat CSV row has no way to represent that nesting. If it is implied that columns 4, 5 (and beyond) are the layers to be nested, you can handle that programmatically.
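
As a rough sketch of one way to do that, you could read the rows with the standard csv module and treat everything from the fourth column onward as layers. The helper name `rows_to_dict` is made up for this example, and the sample rows are inlined through `io.StringIO` so it runs without the file.

```python
import csv
import io
import json

def rows_to_dict(lines):
    """Build the nested dictionary from an iterable of comma-separated lines."""
    result = {}
    # skipinitialspace=True drops the space that follows each comma.
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue  # csv.reader yields an empty list for blank lines
        # Columns 1-3 are fixed; any remaining columns are layers.
        name, server, url, *layers = row
        result[name] = {
            "type": server,
            "url": url,
            "layer(s)": {str(i): layer for i, layer in enumerate(layers, 1)},
        }
    return result

sample = io.StringIO(
    "Name 2, FeatureServer, thisIsALink, layer 1\n"
    "\n"
    "Name 3, MapServer, thisIsALink, layer 1, layer 2\n"
)
result = rows_to_dict(sample)
print(json.dumps(result, indent=4))
```

With a real file you would pass the open file object instead of the `io.StringIO` sample, for example `rows_to_dict(open("input.txt", newline=""))` inside a `with` block.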

I assume this is the tutorial you are referring to:

https://www.geeksforgeeks.org/convert-csv-to-json-using-python/
