I have dictionary
and JSON
files that I want to combine into a dataframe.
import json
import pandas as pd
jsonn = """
{
"Cake_make": {
"id": "25803",
"dday": "2009-01-01T15:00:00",
"cake_name": "chocolate",
"Mainmetrics": "7.6",
"TC": "10000.02",
"LSS":"102341.32",
"LTC":"12321.65"
},
"Jd": "2009-01-01T15:00:00"
}
"""
my_dic = {'Cake_make': {'id': 'matched', 'dday': 'matched', 'cake_name': 'Not Matched'}}
json_data = json.loads(jsonn)
I've found some examples regarding writing JSON files into CSV and merging JSON files but not the exact solution that I'm looking for.
CodePudding user response:
You can just add the fields manually:
import json
import pandas as pd
jsonn = """
{
"Cake_make": {
"id": "25803",
"dday": "2009-01-01T15:00:00",
"cake_name": "chocolate",
"Mainmetrics": "7.6",
"TC": "10000.02",
"LSS":"102341.32",
"LTC":"12321.65"
},
"Jd": "2009-01-01T15:00:00"
}
"""
my_dic = {'Cake_make': {'id': 'matched', 'dday': 'matched', 'cake_name': 'Not Matched'}}
json_data = json.loads(jsonn)
combined_dict = {
"id": json_data["Cake_make"]["id"],
"dday": json_data["Cake_make"]["dday"],
"Jd": json_data["Jd"],
"cake_name": json_data["Cake_make"]["cake_name"],
"dict": my_dict
}
dataframe = pd.DataFrame(data=combined_dict)
CodePudding user response:
As Ryan's answer shows, it is probably easiest to build the exact dictionary you want (combined_dict
) beforehand and simply pass that to DataFrame
, rather than trying to re-arrange the data after getting it into the df. You do not post the code you found on the internet that didn't do what you wanted, so I'm not sure what you were trying that didn't work.
Also, you are not merging JSON files, nor even JSON objects. You are actually working with python dictionary objects in the end. See here what json.loads actually does. It converts the JSON string into a dictionary you can traverse via keys. Read up about traversing python dictionary structures, there are many intuitive methods to make it easy (enumerating elements, looping through them, etc).
Edit: To attempt to answer your further comment, there are methods such as dict.keys() to dynamically read the keys (which you seem to be using as column names) from a dictionary object, and other helpful methods.
CodePudding user response:
There's always pd.json_normalize
...
df = pd.json_normalize(json_data)
df['dict'] = [my_dic]
print(df)
# Output:
Jd Cake_make.id Cake_make.dday Cake_make.cake_name Cake_make.Mainmetrics Cake_make.TC Cake_make.LSS Cake_make.LTC dict
0 2009-01-01T15:00:00 25803 2009-01-01T15:00:00 chocolate 7.6 10000.02 102341.32 12321.65 {'Cake_make': {'id': 'matched', 'dday': 'matched', 'cake_name': 'Not Matched'}}