Good afternoon, I don't have a background on python, and i tried some pre made code that is published on the internet and stack overflow but i don't get the result i want. here is my reference: https://www.geeksforgeeks.org/convert-csv-to-json-using-python. maybe someone can help me with a simple code, i want to convert this csv format
appname | hostname | id |
---|---|---|
backend | testserver1 | 1 |
frontend | testserver2 | 2 |
database | testserver3 | 3 |
into a json format that looks like this
{
"appname": ["backend","frontend","database"],
"hostname": ["testserver1","testserver2","testserver3"],
"id": ["1","2","3"]
}
What im currently using:
import csv
import json
# Function to convert a CSV to JSON
# Takes the file paths as arguments
def make_json(csvFilePath, jsonFilePath):
# create a dictionary
data = {}
# Open a csv reader called DictReader
with open(csvFilePath, encoding='utf-8') as csvf:
csvReader = csv.DictReader(csvf)
# Convert each row into a dictionary
# and add it to data
for rows in csvReader:
# Assuming a column named 'No' to
# be the primary key
key = rows['appname']
data[key] = rows
# Open a json writer, and use the json.dumps()
# function to dump data
with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
jsonf.write(json.dumps(data, indent=4))
# Driver Code
# Decide the two file paths according to your
# computer system
csvFilePath = r'Names.csv'
jsonFilePath = r'Names.json'
# Call the make_json function
make_json(csvFilePath, jsonFilePath)
output from the code:
{
"backend": {
"appname": "backend",
"hostname": "testserver1",
"ami_id": "1"
},
"frontend": {
"appname": "frontend",
"hostname": "testserver2",
"ami_id": "2"
},
"database": {
"appname": "database",
"hostname": "testserver3",
"ami_id": "3"
}
The result is want:
{
"appname": ["backend","frontend","database"],
"hostname": ["testserver1","testserver2","testserver3"],
"id": ["1","2","3"]
}
CodePudding user response:
Unless your input file is not realy huge I would suggest using pandas:
import pandas as pd
csv_df = pd.read_csv('stack_69581387.csv')
d= {'appname': csv_df.appname.tolist(),
'hostname': csv_df.hostname.tolist(),
'id': csv_df.id.tolist()}
If you want to have only unique values (it is not specified in your question what logic should be applied if you have redundancies in the data) you may use:
import pandas as pd
csv_df = pd.read_csv('stack_69581387.csv')
d = {'appname': csv_df.appname.unique().tolist(),
'hostname': csv_df.hostname.unique().tolist(),
'id': csv_df.id.unique().tolist()}
And, to write it:
import json
with open('data_69581387.json', 'w') as outfile:
json.dump(d, outfile)
CodePudding user response:
If you print each dictionary during row in csvReader
loop you'll see:
{'appname': 'backend', 'hostname': 'testserver1', 'id': '1'}
{'appname': 'frontend', 'hostname': 'testserver2', 'id': '2'}
{'appname': 'database', 'hostname': 'testserver3', 'id': '3'}
So you need to modify the loop to get desired behavior:
# Open a csv reader called DictReader
with open(csvFilePath, encoding='utf-8') as csvf:
csvReader = csv.DictReader(csvf)
# Convert each row into a dictionary
# and add it to data
for row in csvReader:
for columnName in row:
if columnName not in data:
data[columnName] = []
data[columnName].append(row[columnName])
The output JSON file will look like:
{
"appname": [
"backend",
"frontend",
"database"
],
"hostname": [
"testserver1",
"testserver2",
"testserver2"
],
"id": [
"1",
"2",
"3"
]
}