Add missing keys in Dataframe so that all dicts in Dataframe have same keys

Time:02-24

I have a list of dicts where one dict may have more keys than another, e.g.

[
  {
    "First Name": "qwert",
    "Last name": "fgh",
    "Status": "status",
    "18/02/22": "4.11",
    "20/02/22": "17.13"
  },
  {
    "First Name": "updated",
    "Last name": "fgh",
    "Status": "status",
    "19/02/22": "14.0",
    "22/02/22": "8.48"
  }
]

As you can see, the dicts above have dates as keys. The first dict doesn't have 19/02/22 or 22/02/22, and the second dict doesn't have 18/02/22 or 20/02/22. Is there any way in pandas to add the missing keys with empty values, so that all dicts have the same keys, like this:

[
  {
    "First Name": "qwert",
    "Last name": "fgh",
    "Status": "status",
    "18/02/22": "4.11",
    "19/02/22": "",
    "20/02/22": "17.13",
    "22/02/22": ""
  },
  {
    "First Name": "updated",
    "Last name": "fgh",
    "Status": "status",
    "18/02/22": "",
    "19/02/22": "14.0",
    "20/02/22": "",
    "22/02/22": "8.48"
  }
]

Basically, I am converting this list of dicts into an Excel file using pandas, and if the dicts don't all have the same keys, the resulting Excel file doesn't have the date columns in order.

In the output, 19/02/22 comes after 20/02/22 because the first dict doesn't have 19/02/22.

The dataset can be large, and I want the Excel file to have all dates in order.

CodePudding user response:

First, convert the list of dicts (L below) to a DataFrame; keys missing from a dict become NaN:

import pandas as pd

df = pd.DataFrame(L)
print (df)
  First Name Last name  Status 18/02/22 20/02/22 19/02/22 22/02/22
0      qwert       fgh  status     4.11    17.13      NaN      NaN
1    updated       fgh  status      NaN      NaN     14.0     8.48
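
To make the step above easier to reproduce, here is a minimal self-contained sketch using the data from the question. The names records, column_order and missing_in_first are only illustrative and do not come from the original answer.

```python
import pandas as pd

# The two dicts from the question (illustrative variable name).
records = [
    {"First Name": "qwert", "Last name": "fgh", "Status": "status",
     "18/02/22": "4.11", "20/02/22": "17.13"},
    {"First Name": "updated", "Last name": "fgh", "Status": "status",
     "19/02/22": "14.0", "22/02/22": "8.48"},
]

df = pd.DataFrame(records)

# Columns are the union of all keys, in the order they were first seen,
# which is why 20/02/22 ends up before 19/02/22 here.
column_order = df.columns.tolist()

# Keys absent from the first dict show up as missing values in row 0.
missing_in_first = [c for c in df.columns if pd.isna(df.loc[0, c])]

print(column_order)
print(missing_in_first)
```

So the "missing keys" are already present as columns after this step; what remains is filling the gaps and fixing the column order.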
    

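Then replace NaN with empty strings and sort the columns with DataFrame.sort_index, using the key parameter to convert the column labels to datetimes with to_datetime. Labels that are not dates become NaT, and na_position='first' keeps those columns at the front: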

f = lambda x: pd.to_datetime(x, dayfirst=True, errors='coerce')
df = df.fillna('').sort_index(axis=1, key=f, na_position='first')
print (df)
  First Name Last name  Status 18/02/22 19/02/22 20/02/22 22/02/22
0      qwert       fgh  status     4.11             17.13         
1    updated       fgh  status              14.0              8.48
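
To see why the name columns stay in front, it may help to look at what the key function returns for the column labels. The sketch below is a variation on the lambda above, not the answer's exact code: it assumes the labels are always day/month/two-digit-year and passes an explicit format, so pandas does not have to guess. The helper name date_key is hypothetical.

```python
import pandas as pd

def date_key(labels):
    # Parse labels as dd/mm/yy; anything that does not match becomes NaT.
    # The explicit format is an assumption about how the date keys look.
    return pd.to_datetime(labels, format="%d/%m/%y", errors="coerce")

labels = pd.Index(["First Name", "Last name", "Status",
                   "18/02/22", "20/02/22", "19/02/22", "22/02/22"])

# True where the label could be parsed as a date.
is_date = date_key(labels).notna().tolist()

# A one-row dummy frame, just to show the resulting column order.
df = pd.DataFrame([[0] * len(labels)], columns=labels)
ordered = df.sort_index(axis=1, key=date_key, na_position="first").columns.tolist()

print(is_date)
print(ordered)
```

The non-date labels all map to NaT, so na_position='first' groups them ahead of the dates, and the date columns are then ordered chronologically rather than as strings.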


Finally, convert back to a list of dicts:

L = df.to_dict(orient='records')
print (L)
[{'First Name': 'qwert',
  'Last name': 'fgh', 
  'Status': 'status', 
  '18/02/22': '4.11',
  '19/02/22': '',
  '20/02/22': '17.13', 
  '22/02/22': ''},
 {'First Name': 'updated', 
  'Last name': 'fgh',
  'Status': 'status', 
  '18/02/22': '', 
  '19/02/22': '14.0',
  '20/02/22': '', 
  '22/02/22': '8.48'}]
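
Putting the steps together, the whole round trip could be wrapped in a small helper. This is only a sketch under the same assumptions as above (dd/mm/yy date keys, empty string as the fill value); the function name align_records and its parameters are hypothetical.

```python
import pandas as pd

def align_records(records, date_format="%d/%m/%y", fill_value=""):
    """Return records where every dict has the same keys, dates in order."""
    # Union of keys as columns; gaps filled with fill_value.
    df = pd.DataFrame(records).fillna(fill_value)
    # Non-date labels parse to NaT and are kept first, in their original order.
    df = df.sort_index(
        axis=1,
        key=lambda labels: pd.to_datetime(labels, format=date_format, errors="coerce"),
        na_position="first",
    )
    return df.to_dict(orient="records")

records = [
    {"First Name": "qwert", "Last name": "fgh", "Status": "status",
     "18/02/22": "4.11", "20/02/22": "17.13"},
    {"First Name": "updated", "Last name": "fgh", "Status": "status",
     "19/02/22": "14.0", "22/02/22": "8.48"},
]

aligned = align_records(records)
print(aligned)
```

Note that fillna is applied to every column, so missing values in the non-date columns would be filled too. If the end goal is only the Excel file, the sorted DataFrame can be written directly with DataFrame.to_excel, without converting back to dicts.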
    