Add dictionary data to a pandas DataFrame, overwriting values without losing the index

Time:09-06

Assuming I have a DF as follows:

import pandas as pd

df = pd.DataFrame({'legs': [2, 4, 8, 0],
                   'wings': [2, 0, 0, 0],
                   'specimen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])
df

resulting in:

        legs  wings  specimen
falcon     2      2        10
dog        4      0         2
spider     8      0         1
fish       0      0         8

Now I receive data in the form of a dict, and I would like to add it to the DF:

new_data = {'dog': {'wings': 45, 'specimen': 89},
            'fish': {'wings': 55555, 'something_new': 'new value'},
            'new_row': {'wings': 90}}
new_data_df = pd.DataFrame(new_data).T
new_data_df

I could use append to add the data to the first DF, but append is deprecated (and was removed in pandas 2.0), so I would rather stay away from it. I can use concat instead, but then the row index gets duplicated.

I don't want the row index to be duplicated. I would like the new data to overwrite existing values, and to add a new column or row whenever one appears in the dict. There should be one and only one dog row in the index; with concat, the dog row appears twice.

Changing ignore_index=False to True does not help: the index labels are simply dropped.
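The duplication described in the question can be reproduced with a short sketch. It reuses the question's df and new_data; the names stacked, duplicated_labels, and renumbered are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"legs": [2, 4, 8, 0], "wings": [2, 0, 0, 0], "specimen": [10, 2, 1, 8]},
    index=["falcon", "dog", "spider", "fish"],
)
new_data = {
    "dog": {"wings": 45, "specimen": 89},
    "fish": {"wings": 55555, "something_new": "new value"},
    "new_row": {"wings": 90},
}
new_data_df = pd.DataFrame(new_data).T

# Plain concatenation stacks the rows, so labels present in both frames repeat.
stacked = pd.concat([df, new_data_df])
duplicated_labels = stacked.index[stacked.index.duplicated()].tolist()
print(duplicated_labels)

# ignore_index=True replaces the labels with 0..n-1 instead of merging rows.
renumbered = pd.concat([df, new_data_df], ignore_index=True)
print(renumbered.index.tolist())
```

Neither call merges rows: concat only stacks them, which is why a label-aware operation is needed.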

CodePudding user response:

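You can use combine_first. Values from new_data_df take priority, and df fills in whatever is missing; the index and columns of the result are the union of both frames:

out = new_data_df.combine_first(df)
out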
         legs something_new specimen  wings
dog       4.0           NaN     89.0   45.0
falcon    2.0           NaN     10.0    2.0
fish      0.0     new value      8.0  55555
new_row   NaN           NaN      NaN   90.0
spider    8.0           NaN      1.0    0.0
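As a self-contained sketch of the combine_first approach, the snippet below rebuilds both frames from the question and checks the properties the question asks for: a unique index, overwritten values, and new rows and columns. The printed checks are illustrative additions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"legs": [2, 4, 8, 0], "wings": [2, 0, 0, 0], "specimen": [10, 2, 1, 8]},
    index=["falcon", "dog", "spider", "fish"],
)
new_data = {
    "dog": {"wings": 45, "specimen": 89},
    "fish": {"wings": 55555, "something_new": "new value"},
    "new_row": {"wings": 90},
}
new_data_df = pd.DataFrame(new_data).T

# The caller (new_data_df) wins wherever it has a non-null value;
# df only fills the cells that new_data_df leaves empty.
out = new_data_df.combine_first(df)

print(out.index.is_unique)             # each label appears once
print(out.loc["dog", "wings"])         # overwritten by the new data
print(out.loc["dog", "legs"])          # kept from the original frame
print(out.loc["fish", "specimen"])     # kept, because the new data has no value here
```

Note the call order: df.combine_first(new_data_df) would instead keep the old values and only fill gaps from the new data. Because missing cells are represented as NaN, integer columns may come back as floats or objects, as in the output above.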

CodePudding user response:

Another option in case you want to keep values from both rows:

import pandas as pd

df = pd.DataFrame({'legs': [2, 4, 8, 0],
                   'wings': [2, 0, 0, 0],
                   'specimen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])

new_data = {'dog': {'wings': 45, 'specimen': 89},
            'fish': {'wings': 55555, 'something_new': 'new value'},
            'new_row': {'wings': 90}}

new_data_df = pd.DataFrame(new_data).T

# Stack both frames, then move the row labels into a column named 'index'
output = pd.concat([df, new_data_df], ignore_index=False).reset_index()

# Collect every value per label into a list, one row per label
output1 = output.groupby('index').agg(list)

print(output1)

               legs       wings   specimen     something_new
index                                                       
dog      [4.0, nan]   [0, 45.0]  [2, 89.0]        [nan, nan]
falcon        [2.0]         [2]       [10]             [nan]
fish     [0.0, nan]  [0, 55555]   [8, nan]  [nan, new value]
new_row       [nan]      [90.0]      [nan]             [nan]
spider        [8.0]         [0]        [1]             [nan]
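If the lists are not needed and the goal is the overwrite behaviour from the question, the same concat-then-group idea can be collapsed to one value per cell. The sketch below is a variation, not the answer's own method: it assumes that taking the last non-null value per label is the intended rule, and the name merged is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"legs": [2, 4, 8, 0], "wings": [2, 0, 0, 0], "specimen": [10, 2, 1, 8]},
    index=["falcon", "dog", "spider", "fish"],
)
new_data = {
    "dog": {"wings": 45, "specimen": 89},
    "fish": {"wings": 55555, "something_new": "new value"},
    "new_row": {"wings": 90},
}
new_data_df = pd.DataFrame(new_data).T

# Stack old rows first and new rows second, then group on the index labels.
stacked = pd.concat([df, new_data_df])

# GroupBy.last() keeps the last non-null value in each column of each group,
# so new data overwrites old data and gaps fall back to the old values.
merged = stacked.groupby(level=0).last()
print(merged)
```

The order inside concat matters here: whichever frame comes last wins when both have a value for the same cell.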