Home > Enterprise >  CSV to JSON convert using python
CSV to JSON convert using python

Time:10-15

Good afternoon, I don't have a background on python, and i tried some pre made code that is published on the internet and stack overflow but i don't get the result i want. here is my reference: https://www.geeksforgeeks.org/convert-csv-to-json-using-python. maybe someone can help me with a simple code, i want to convert this csv format

appname hostname id
backend testserver1 1
frontend testserver2 2
database testserver3 3

into a json format that looks like this

{
  
  "appname": ["backend","frontend","database"],
  "hostname": ["testserver1","testserver2","testserver3"],
  "id": ["1","2","3"]

}

What im currently using:

import csv
import json


# Function to convert a CSV to JSON
# Takes the file paths as arguments
def make_json(csvFilePath, jsonFilePath):

        # create a dictionary
        data = {}

        # Open a csv reader called DictReader
        with open(csvFilePath, encoding='utf-8') as csvf:
                csvReader = csv.DictReader(csvf)

                # Convert each row into a dictionary
                # and add it to data
                for rows in csvReader:

                        # Assuming a column named 'No' to
                        # be the primary key
                        key = rows['appname']
                        data[key] = rows

        # Open a json writer, and use the json.dumps()
        # function to dump data
        with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
                jsonf.write(json.dumps(data, indent=4))

# Driver Code

# Decide the two file paths according to your
# computer system
csvFilePath = r'Names.csv'
jsonFilePath = r'Names.json'

# Call the make_json function
make_json(csvFilePath, jsonFilePath)

output from the code:

{
    "backend": {
        "appname": "backend",
        "hostname": "testserver1",
        "ami_id": "1"
    },
    "frontend": {
        "appname": "frontend",
        "hostname": "testserver2",
        "ami_id": "2"
    },
    "database": {
        "appname": "database",
        "hostname": "testserver3",
        "ami_id": "3"
    }

The result is want:

{
  
  "appname": ["backend","frontend","database"],
  "hostname": ["testserver1","testserver2","testserver3"],
  "id": ["1","2","3"]

}

CodePudding user response:

Unless your input file is not realy huge I would suggest using pandas:

import pandas as pd
csv_df = pd.read_csv('stack_69581387.csv')

d= {'appname': csv_df.appname.tolist(),
    'hostname': csv_df.hostname.tolist(),
    'id': csv_df.id.tolist()}

If you want to have only unique values (it is not specified in your question what logic should be applied if you have redundancies in the data) you may use:

import pandas as pd
csv_df = pd.read_csv('stack_69581387.csv')

d = {'appname': csv_df.appname.unique().tolist(),
     'hostname': csv_df.hostname.unique().tolist(),
     'id': csv_df.id.unique().tolist()}

And, to write it:

import json
with open('data_69581387.json', 'w') as outfile:
    json.dump(d, outfile)

CodePudding user response:

If you print each dictionary during row in csvReader loop you'll see:

{'appname': 'backend', 'hostname': 'testserver1', 'id': '1'}
{'appname': 'frontend', 'hostname': 'testserver2', 'id': '2'}
{'appname': 'database', 'hostname': 'testserver3', 'id': '3'}

So you need to modify the loop to get desired behavior:

        # Open a csv reader called DictReader
        with open(csvFilePath, encoding='utf-8') as csvf:
            csvReader = csv.DictReader(csvf)

            # Convert each row into a dictionary
            # and add it to data
            for row in csvReader:
                for columnName in row:
                    if columnName not in data:
                        data[columnName] = []
                    data[columnName].append(row[columnName])

The output JSON file will look like:

{
    "appname": [
        "backend",
        "frontend",
        "database"
    ],
    "hostname": [
        "testserver1",
        "testserver2",
        "testserver2"
    ],
    "id": [
        "1",
        "2",
        "3"
    ]
}
  • Related