merge json file with csv file to pandas

Time:08-31

I have a json file like

{
    "type" : "FeatureCollection",
    "name" : "NBG_DATA.CBSWBI",
    "features" : [
        {
            "type" : "Feature",
            "geometry" : {
                "type" : "Polygon",
                "coordinates" : [
                    [
                        [ 5.9790846852, 51.8962916127 ],
                        [ 5.9787704282, 51.8970121894 ],
                        [ 5.9115581116, 51.9006415588 ],
                        [ 5.9115375181, 51.9007482452 ],
                        [ 5.911477843, 51.9011278526 ]
                    ]
                ]
            },
            "properties" : {
                "WK_CODE" : "WK170500", "WK_NAAM" : "Bemmel", "POPULATION" : "35000"
            }
        }
    ]
}

with the field WK_CODE.

I want to merge this JSON file with a pandas DataFrame (loaded from a CSV file) that has the same WK_CODE column.

I want to combine the JSON and the CSV/DataFrame so that I end up with a DataFrame holding the population count per WK_CODE from the JSON file. How can I do that?

CodePudding user response:

This is a small part of the JSON file:

{
            "type" : "Feature",
            "geometry" : {
                "type" : "Polygon",
                "coordinates" : [
                    [
                        [ 5.911477843, 51.9011278526 ],
                        [ 5.9114210585, 51.9014890654 ],
                        [ 5.9115581116, 51.9006415588 ],
                        [ 5.9115375181, 51.9007482452 ],
                        [ 5.911477843, 51.9011278526 ]
                    ]
                ]
            },
            "properties" : {
                "WK_CODE" : "WK170500"
                
            }
        }
    ]
}

CodePudding user response:

First you have to load your json from a file:

import json
with open('path/to/file.json') as fh:
    json_data = json.load(fh)

For our sake, let's say you loaded the data and it looks like this:

json_data =  { 
    "type" : "Feature", 
    "geometry" : {
        "type" : "Polygon", 
        "coordinates" : [ [ [ 5.911477843, 51.9011278526 ], [ 5.9114210585, 51.9014890654 ], [ 5.9115581116, 51.9006415588 ], [ 5.9115375181, 51.9007482452 ], [ 5.911477843, 51.9011278526 ] ] ] }, 
    "properties" : { 
        "WK_CODE" : "WK170500", "WK_NAAM" : "Bemmel", "POPULATION" : "35000"
    }
}

Let's create a main df and also a df_json based on this json_data:

>>> import pandas as pd
>>> df = pd.DataFrame({'WK_CODE': ['WK170500', 'WK170501'], 'field2': [1, 2]})
>>> df
    WK_CODE  field2
0  WK170500       1
1  WK170501       2

>>> df_json = pd.DataFrame.from_dict(json_data["properties"], orient="index").T
>>> df_json
    WK_CODE WK_NAAM POPULATION
0  WK170500  Bemmel      35000
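Note that json_data above is a single Feature, while the file in the question is a FeatureCollection with a "features" list. A minimal sketch of building df_json from every feature at once might look like the following. The name collection and the second feature (with its WK_NAAM and POPULATION values) are made up purely for illustration, and the geometry is set to None to keep the example short:

```python
import pandas as pd

# Hypothetical stand-in for the loaded file: a FeatureCollection with two features.
# The second feature's values are invented for illustration only.
collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"WK_CODE": "WK170500", "WK_NAAM": "Bemmel", "POPULATION": "35000"}},
        {"type": "Feature", "geometry": None,
         "properties": {"WK_CODE": "WK170501", "WK_NAAM": "Example", "POPULATION": "1200"}},
    ],
}

# One row per feature, one column per key found in "properties".
df_json = pd.DataFrame([feature["properties"] for feature in collection["features"]])

# POPULATION is stored as a string in the JSON, so convert it before doing arithmetic.
df_json["POPULATION"] = pd.to_numeric(df_json["POPULATION"])
```

With a real file, collection would be the object returned by json.load, and the rest should work the same way as long as every feature has a "properties" dict.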

Next use pd.merge:

>>> pd.merge(df, df_json, on='WK_CODE', how='left')
    WK_CODE  field2 WK_NAAM POPULATION
0  WK170500       1  Bemmel      35000
1  WK170501       2     NaN        NaN
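To get from the merged frame to a population figure per WK_CODE, one possible follow-up is sketched below. It assumes POPULATION arrives as text (as in the sample JSON) and rebuilds the two frames so the snippet runs on its own. The names merged and population_per_code are just illustrative:

```python
import pandas as pd

# Same frames as in the answer, rebuilt here so the snippet is self-contained.
df = pd.DataFrame({"WK_CODE": ["WK170500", "WK170501"], "field2": [1, 2]})
df_json = pd.DataFrame(
    {"WK_CODE": ["WK170500"], "WK_NAAM": ["Bemmel"], "POPULATION": ["35000"]}
)

# The JSON stores POPULATION as text; convert it so it can be summed.
df_json["POPULATION"] = pd.to_numeric(df_json["POPULATION"])

# indicator=True adds a "_merge" column showing whether each row found a match
# ("both") or exists only in the left frame ("left_only").
merged = pd.merge(df, df_json, on="WK_CODE", how="left", indicator=True)

# Total population per WK_CODE; sum() skips NaN, so unmatched codes come out as 0.
population_per_code = merged.groupby("WK_CODE")["POPULATION"].sum()
```

The "_merge" column is a quick way to spot WK_CODE values in the CSV that have no counterpart in the JSON, which would otherwise only show up as NaN.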