Python - Vectorized operation to filter HTTP response JSON

I have a list of keys that I want to extract from the data object received in an HTTP response (JSON).

Let the list be something like:

 ['uid', 'status', 'profile.name', 'profile.login', 'profile.mobile']

I am parsing nested JSON that looks something like this:

{
    "uid": "a1234",
    "status": "active",
    "native": false,
    "profile": {
        "name": "xyz",
        "alias": "z",
        "login": "abc"
    }
}

Using pandas.json_normalize() on the above JSON gives me a dataframe like:

        uid     status     native     profile.name     profile.alias     profile.login
    
   0    a1234   active     False      xyz              z                 abc
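
For reference, the flattening step can be reproduced with a short sketch. The variable name `js` and the literal payload are assumptions that mirror the JSON above, already parsed into a Python dict (so JSON `false` becomes `False`).

```python
import pandas as pd

# Sample payload mirroring the JSON above; the name `js` is an assumption.
js = {
    "uid": "a1234",
    "status": "active",
    "native": False,
    "profile": {"name": "xyz", "alias": "z", "login": "abc"},
}

# json_normalize flattens nested dicts into dot-separated column names,
# producing a one-row DataFrame for a single dict.
df = pd.json_normalize(js)
print(df)
```

Nested keys such as `profile.name` become ordinary columns, which is why the key list above can use dotted names.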

Now, how do I use the above list to build a new list or dataframe (either is fine) that keeps the values from the response in the order of the first list, treating any key missing from the HTTP response JSON as NA or ""? I expect to get a list or dataframe like:

['a1234', 'active', 'xyz', 'abc', '']

        uid     status     profile.name     profile.login    profile.mobile
    
   0    a1234   active     xyz              abc              ""

Currently I loop over the first list to achieve this, but it is slow. Is there a way to use vectorized operations to speed this up?

CodePudding user response:

IIUC, you can use reindex like this:

import pandas as pd

# js is the parsed JSON response (a dict)
print(pd.json_normalize(js)
        .reindex(columns=['uid', 'status', 'profile.name',
                          'profile.login', 'profile.mobile'])
        .fillna(''))
#      uid  status profile.name profile.login profile.mobile
# 0  a1234  active          xyz           abc
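
If a plain list is wanted instead of a dataframe, the same reindex approach can be extended as in the sketch below. It is self-contained and makes a few assumptions: the names `responses` and `wanted` are hypothetical, and the second record is made up purely to show that several responses can be normalized in one call rather than one at a time.

```python
import pandas as pd

# Hypothetical parsed responses; the second one lacks the "profile" object.
responses = [
    {
        "uid": "a1234",
        "status": "active",
        "native": False,
        "profile": {"name": "xyz", "alias": "z", "login": "abc"},
    },
    {"uid": "b5678", "status": "inactive", "native": True},
]

# Keys to keep, in the desired order; 'profile.mobile' is absent everywhere.
wanted = ['uid', 'status', 'profile.name', 'profile.login', 'profile.mobile']

# Flatten all responses at once, keep only the wanted columns (columns that
# do not exist are created as NaN), then replace NaN with ''.
df = pd.json_normalize(responses).reindex(columns=wanted).fillna('')

# One list per response, values ordered like `wanted`.
rows = df.values.tolist()
print(rows[0])  # ['a1234', 'active', 'xyz', 'abc', '']
```

Note that fillna('') also blanks any null values that were present in the response itself, not only the missing keys. If that distinction matters, skipping fillna and checking for NaN afterwards may be a better fit.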