I have some local JSON files. For example:
JSON 1
{
  "events": {
    "id1": {
      "name": "Marcus",
      "surname": "Redwhite",
      "age": "22",
      "text": {
        "description": "Some description ...",
        "title": "title of description"
      }
    },
    "id2": {
      "name": "Fred",
      "surname": "Rose",
      "age": "30",
      "text": {
        "description": "Some description ...",
        "title": "title of description"
      }
    }
  }
}
JSON 1 Modified
{
  "events": {
    "id1": {
      "name": "Marcus Modified",
      "surname": "Redwhite Modified",
      "age": "22",
      "text": {
        "description": "Some description ...",
        "title": "title of description Modified"
      }
    },
    "id2": {
      "name": "Fred",
      "surname": "Rose Modified",
      "age": "50",
      "text": {
        "description": "Some description ... Modified",
        "title": "title of description"
      }
    }
  }
}
I have to compare these JSON files (in this example the name, surname, age and text fields were modified) and calculate the percentage of difference between them, then draw a pie chart or some other graph of the result. Is there a way to do it? This is my code so far:
import glob
import json

# Get the lists of all JSON files in the two folders
# (sorted so that files with matching names pair up)
originalJsonFilesList = sorted(glob.glob("C:/Python/OriginalJson/*.json"))
modifiedJsonFilesList = sorted(glob.glob("C:/Python/ModifiedJson/*.json"))

# Loop over both lists in parallel
for originalFile, modifiedFile in zip(originalJsonFilesList, modifiedJsonFilesList):
    # Open the JSON files (original and modified); "with" closes them automatically
    with open(originalFile) as originalJson, open(modifiedFile) as modifiedJson:
        # Load as dictionaries
        data1 = json.load(originalJson)
        data2 = json.load(modifiedJson)
    #############################################
    # Something for calculating the percentage
    # of difference between data1 and data2
    #############################################
CodePudding user response:
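First, convert your JSON to dicts (use json.load instead of json.loads if you are reading from an open file, as in the question):
dict_json1 = json.loads(json_1)
dict_json_modified = json.loads(json_modified)
Second, convert their items to sets and take the symmetric difference:
items_original = set(dict_json1.items())
items_modified = set(dict_json_modified.items())
diff = items_original ^ items_modified
print(diff)
Note that this only works when every value is hashable. With the nested dicts in the question, set() raises TypeError (unhashable type: 'dict'), so the data has to be flattened first.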
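For nested data like the example in the question, one way to adapt this idea is to flatten each document into path/value pairs and count the paths whose values differ. This is a minimal sketch, not a definitive implementation: `flatten` and `percent_changed` are hypothetical helper names, and counting every leaf field equally is an assumption about what "percentage of difference" should mean.

```python
import json

_MISSING = object()  # sentinel for a path that exists on one side only


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for key, value in node.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def percent_changed(original, modified):
    """Percentage of leaf paths that were changed, added, or removed."""
    flat_a = dict(flatten(original))
    flat_b = dict(flatten(modified))
    all_paths = set(flat_a) | set(flat_b)
    if not all_paths:
        return 0.0
    changed = [
        path for path in all_paths
        if flat_a.get(path, _MISSING) != flat_b.get(path, _MISSING)
    ]
    return 100.0 * len(changed) / len(all_paths)


# Small illustrative documents (shortened from the question's example)
original = json.loads(
    '{"events": {"id1": {"name": "Marcus", "surname": "Redwhite",'
    ' "age": "22", "text": {"title": "title"}}}}'
)
modified = json.loads(
    '{"events": {"id1": {"name": "Marcus Modified", "surname": "Redwhite",'
    ' "age": "22", "text": {"title": "title Modified"}}}}'
)
print(percent_changed(original, modified))  # 2 of 4 leaf fields differ -> 50.0
```

The resulting number and its complement (100 minus the result) could then be passed to whatever plotting library you use for the pie chart.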
CodePudding user response:
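One option is the Levenshtein distance, a metric that counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another.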
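A minimal sketch of that metric, using the standard dynamic-programming formulation with two rows. The `percent_different` helper is hypothetical, and dividing by the length of the longer string is just one possible way to turn the distance into a percentage.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def percent_different(a, b):
    """Edit distance as a percentage of the longer string's length (an assumption)."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else 100.0 * levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))  # 3
```

This could be applied per field (for example to each "name" value) or to the whole serialized document, depending on how fine-grained the comparison should be.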
There's another question similar to your situation with an alternative solution; you can take a look at it.
You can also check SequenceMatcher from difflib.
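A minimal sketch of the SequenceMatcher route, assuming it is acceptable to compare the two documents as serialized text. The `similarity_percent` helper is a hypothetical name; `sort_keys=True` is used so that key order does not affect the result, and `autojunk=False` turns off difflib's heuristic for long sequences.

```python
import json
from difflib import SequenceMatcher


def similarity_percent(original, modified):
    """Text similarity of two JSON-compatible objects, from 0.0 to 100.0."""
    text_a = json.dumps(original, sort_keys=True)
    text_b = json.dumps(modified, sort_keys=True)
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    return 100.0 * matcher.ratio()


data1 = {"events": {"id1": {"name": "Fred", "age": "30"}}}
data2 = {"events": {"id1": {"name": "Fred Modified", "age": "50"}}}

similarity = similarity_percent(data1, data2)
difference = 100.0 - similarity
print(f"similar: {similarity:.1f}%, different: {difference:.1f}%")
```

Keep in mind that this measures how much of the text matches, so a long unchanged description weighs more than a short changed field such as "age".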