pandas: update DataFrame values with another DataFrame that is not in the same format

Time:03-30

I have two DataFrames. The second DataFrame contains the values to be updated in the first DataFrame, matched on id. The first DataFrame, df:

import pandas as pd

data = [[1, "potential"], [2, "lost"], [3, "at risk"], [4, "promising"]]
df = pd.DataFrame(data, columns=['id', 'class'])

id  class
1   potential
2   lost
3   at risk
4   promising

df2:

data2 = [[2, "new"], [4, "loyal"]]
df2 = pd.DataFrame(data2, columns=['id', 'class'])

id  class
2   new
4   loyal

Expected output:

data3 = [[1, "potential"], [2, "new"], [3, "at risk"], [4, "loyal"]]
df3 = pd.DataFrame(data3, columns=['id', 'class'])

id  class
1   potential
2   new
3   at risk
4   loyal

The code below seems to work, but I believe there is a more efficient solution.

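# Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0;
# on current versions use pd.concat([df, df2]) instead.
final = df.append([df2])
final = final.drop_duplicates(subset='id', keep="last")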

CodePudding user response:

Your solution is good. Here is an alternative with concat, plus DataFrame.sort_values to restore the id order and reset the index:

df = (pd.concat([df, df2])
        .drop_duplicates(subset='id', keep="last")
        .sort_values('id', ignore_index=True))
print(df)
   id      class
0   1  potential
1   2        new
2   3    at risk
3   4      loyal
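
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same concat approach. The helper name update_by_concat and its key parameter are made up for illustration and are not part of pandas. One thing to keep in mind: with this approach an id that exists only in df2 ends up as a new row in the result.

```python
import pandas as pd


def update_by_concat(base, updates, key="id"):
    # Hypothetical helper, not a pandas API.
    # Stack both frames; rows from `updates` come last,
    # so keep="last" lets them win over the rows in `base`.
    combined = pd.concat([base, updates], ignore_index=True)
    deduped = combined.drop_duplicates(subset=key, keep="last")
    # Restore the key order and renumber the index from 0.
    return deduped.sort_values(key, ignore_index=True)


df = pd.DataFrame(
    [[1, "potential"], [2, "lost"], [3, "at risk"], [4, "promising"]],
    columns=["id", "class"],
)
df2 = pd.DataFrame([[2, "new"], [4, "loyal"]], columns=["id", "class"])

result = update_by_concat(df, df2)
print(result)
```

The input frames are left unchanged, because pd.concat builds a new DataFrame.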

CodePudding user response:

We can use DataFrame.update, which aligns the two frames on their index and modifies the calling DataFrame in place:

df = df.set_index('id')
df.update(df2.set_index('id'))
df = df.reset_index()

Result

print(df)

   id      class
0   1  potential
1   2        new
2   3    at risk
3   4      loyal
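
As a rough sketch, the three steps above can be wrapped in a small function so the original frame is not overwritten. The name update_by_index and the key parameter are assumptions for illustration only. DataFrame.update only touches rows that already exist in the calling frame and skips NaN values in the other frame, so an id that appears only in df2 is ignored here rather than added.

```python
import pandas as pd


def update_by_index(base, updates, key="id"):
    # Hypothetical helper, not a pandas API.
    # set_index returns a new frame, so `base` itself is not modified.
    out = base.set_index(key)
    # update() works in place and returns None; it aligns on the index
    # and overwrites matching cells with non-NA values from `updates`.
    out.update(updates.set_index(key))
    # Turn the key back into a regular column.
    return out.reset_index()


df = pd.DataFrame(
    [[1, "potential"], [2, "lost"], [3, "at risk"], [4, "promising"]],
    columns=["id", "class"],
)
df2 = pd.DataFrame([[2, "new"], [4, "loyal"]], columns=["id", "class"])

result = update_by_index(df, df2)
print(result)
```

Which of the two approaches fits better probably depends on whether unmatched ids in df2 should be appended (concat) or dropped (update).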