How can I fill in values from other DataFrames without dropping the rows that stay NaN? I have three DataFrames similar to these:
df_main = pd.DataFrame({'ID': ['10', '11', '12', '13', '14', '15', '16'], 'Name': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})
ID Name
0 10 NaN
1 11 NaN
2 12 NaN
3 13 NaN
4 14 NaN
5 15 NaN
6 16 NaN
df2 = pd.DataFrame({'ID': ['10', '11', '12'], 'Name': [ 'Peter', 'Bruce', 'Tony']})
ID Name
0 10 Peter
1 11 Bruce
2 12 Tony
df3 = pd.DataFrame({'ID': ['15', '16'], 'Name': ['Wanda', 'Natasha']})
ID Name
0 15 Wanda
1 16 Natasha
What I want to have is data like this:
ID Name
0 10 Peter
1 11 Bruce
2 12 Tony
3 13 NaN
4 14 NaN
5 15 Wanda
6 16 Natasha
I tried this code, but it did not work:
for id in df2['ID'].unique():
    if id in df_main['ID'].unique():
        df_main.loc[df_main['ID'] == id, 'Name'] = df2.loc[df2['ID'] == id, 'Name']
for id in df3['ID'].unique():
    if id in df_main['ID'].unique():
        df_main.loc[df_main['ID'] == id, 'Name'] = df3.loc[df3['ID'] == id, 'Name']
CodePudding user response:
IIUC, you can use concat with GroupBy.first:
out = pd.concat([df2, df_main, df3]).groupby("ID", as_index=False).first()
Output:
print(out)
ID Name
0 10 Peter
1 11 Bruce
2 12 Tony
3 13 None
4 14 None
5 15 Wanda
6 16 Natasha
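For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same concat + GroupBy.first idea, using the frames from the question. It assumes each ID appears at most once per frame. Whether the unmatched names print as None or NaN may depend on your pandas version, so the last line checks them with isna instead.

```python
import numpy as np
import pandas as pd

# Frames copied from the question.
df_main = pd.DataFrame({'ID': ['10', '11', '12', '13', '14', '15', '16'],
                        'Name': [np.nan] * 7})
df2 = pd.DataFrame({'ID': ['10', '11', '12'], 'Name': ['Peter', 'Bruce', 'Tony']})
df3 = pd.DataFrame({'ID': ['15', '16'], 'Name': ['Wanda', 'Natasha']})

# Stack all rows, then keep the first non-null Name found for each ID.
# IDs that only ever have NaN (13 and 14) stay missing.
out = pd.concat([df2, df_main, df3]).groupby("ID", as_index=False).first()

print(out)
# True marks the IDs that still have no name.
print(out["Name"].isna().tolist())
```

Since first() skips nulls by default, the order of the frames inside concat only matters if two frames give different names for the same ID.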
CodePudding user response:
concat df2/df3 and map the values:
df_main['Name'] = df_main['ID'].map(pd.concat([df2, df3]).set_index('ID')['Name'])
Output:
ID Name
0 10 Peter
1 11 Bruce
2 12 Tony
3 13 NaN
4 14 NaN
5 15 Wanda
6 16 Natasha
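As far as I can tell, the loop in the question fails because assigning a Series through .loc aligns on the index: the df3 rows have index 0 and 1, while the matching df_main rows have index 5 and 6, so nothing lines up and those names stay NaN. map sidesteps this by looking each ID up in a Series indexed by ID. A runnable sketch with the frames from the question (the name lookup is just an illustrative variable, and IDs are assumed to be unique across df2 and df3):

```python
import numpy as np
import pandas as pd

# Frames copied from the question.
df_main = pd.DataFrame({'ID': ['10', '11', '12', '13', '14', '15', '16'],
                        'Name': [np.nan] * 7})
df2 = pd.DataFrame({'ID': ['10', '11', '12'], 'Name': ['Peter', 'Bruce', 'Tony']})
df3 = pd.DataFrame({'ID': ['15', '16'], 'Name': ['Wanda', 'Natasha']})

# Lookup Series: ID -> Name, built only from the frames that hold names.
lookup = pd.concat([df2, df3]).set_index('ID')['Name']

# Each ID in df_main is looked up by value, not by row position;
# IDs missing from the lookup (13 and 14) come back as NaN.
df_main['Name'] = df_main['ID'].map(lookup)

print(df_main)
```

If df_main could already contain names you want to keep, one option is to chain .fillna(df_main['Name']) after the map call so existing values are not overwritten with NaN.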