Add a column to a dataframe based on existing values in another dataframe

Time:05-14

I have a dataframe DF3:

zone_id   combine
0         ABD
10        BCD
20        ABC
30        ABE

and a second dataframe, combinaison_df:

zone_id    combine
0          XYZ
10         BCD
20         ABD
30         ABC
40         DEF

I would like to add a new column DF3_index to the combinaison_df dataframe that contains the index in DF3 of each combine value.

Here is an example of the expected result:

zone_id    combine  DF3_index
0          XYZ       NaN
10         BCD       1
20         ABD       0
30         ABC       2
40         DEF       NaN

I tried this code to add the DF3_index column:

for i in len(combinaison_df):
  DF3.index[DF3['combine'].str.contains(combinaison_df['combine'][i], regex=False)].tolist()

But I got this error:

      3     DF3.index[DF3['combine'].str.contains(combinaison_df['combine'][i], regex=False)].tolist()

TypeError: 'int' object is not iterable

Can you help me fix this error?

Thanks
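On the error itself: len(combinaison_df) returns a single int, and a for loop cannot iterate over an int, which is what the TypeError reports. Wrapping it in range() gives the loop something to iterate over. The sketch below is one possible repair of the loop, not necessarily the best approach. The sample frames are rebuilt from the tables in the question, and keeping only the first match per value is an assumption.

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables in the question
DF3 = pd.DataFrame({"zone_id": [0, 10, 20, 30],
                    "combine": ["ABD", "BCD", "ABC", "ABE"]})
combinaison_df = pd.DataFrame({"zone_id": [0, 10, 20, 30, 40],
                               "combine": ["XYZ", "BCD", "ABD", "ABC", "DEF"]})

df3_index = []
# range(len(...)) yields 0, 1, 2, ...; len(...) alone is just one int
for i in range(len(combinaison_df)):
    value = combinaison_df["combine"].iloc[i]
    # Same substring test as in the question
    matches = DF3.index[DF3["combine"].str.contains(value, regex=False)].tolist()
    # Assumption: keep the first match, or NaN when the value is absent
    df3_index.append(matches[0] if matches else np.nan)

combinaison_df["DF3_index"] = df3_index
```

Note that str.contains matches substrings, so a combine value such as "AB" would match several rows of DF3. An exact comparison with == may be closer to what is wanted.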

CodePudding user response:

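Would this work for you?

import pandas as pd

# Store DF3's row index in a regular column so it survives the merge
DF3['DF3_index'] = DF3.index

# A left merge keeps every row of combinaison_df; unmatched values get NaN
combinaison_df = pd.merge(combinaison_df, DF3[['combine', 'DF3_index']],
                          on=['combine'], how='left')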

Output:

print(combinaison_df)

   zone_id combine  DF3_index
0        0     XYZ        NaN
1       10     BCD        1.0
2       20     ABD        0.0
3       30     ABC        2.0
4       40     DEF        NaN
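The merged column comes out as floats (1.0, 0.0, 2.0) because NaN forces a float64 dtype, while the expected result in the question shows integers. If that matters, one option is pandas' nullable "Int64" dtype. The following is a minimal self-contained sketch, assuming the sample data from the question; the names lookup and merged are only illustrative.

```python
import pandas as pd

# Sample data copied from the tables in the question
DF3 = pd.DataFrame({"zone_id": [0, 10, 20, 30],
                    "combine": ["ABD", "BCD", "ABC", "ABE"]})
combinaison_df = pd.DataFrame({"zone_id": [0, 10, 20, 30, 40],
                               "combine": ["XYZ", "BCD", "ABD", "ABC", "DEF"]})

# Build the lookup as a separate frame so DF3 itself gains no extra column
lookup = DF3[["combine"]].assign(DF3_index=DF3.index)
merged = combinaison_df.merge(lookup, on="combine", how="left")

# The unmatched rows force float64; "Int64" keeps integers and shows <NA>
merged["DF3_index"] = merged["DF3_index"].astype("Int64")
print(merged)
```

If DF3 contained the same combine value more than once, the left merge would produce one output row per match, so the result could have more rows than combinaison_df.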

CodePudding user response:

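import numpy as np

# Map each combine value to its positional index in DF3
s = DF3.reset_index(drop=True)["combine"].to_dict()
s = dict(zip(s.values(), s.keys()))
# Look up each row's combine value; values missing from DF3 become NaN
combinaison_df["DF3_index"] = combinaison_df.apply(
    lambda x: int(s[x["combine"]]) if x["combine"] in s else np.nan, axis=1)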
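The same dictionary-style lookup can probably be written without a row-wise apply by using Series.map. This is a sketch of that alternative, not the answer's original code. It assumes the sample data from the question and that combine values are unique in DF3, since map raises an error when the lookup Series has duplicate index labels. The name lookup is illustrative.

```python
import pandas as pd

# Sample data copied from the tables in the question
DF3 = pd.DataFrame({"zone_id": [0, 10, 20, 30],
                    "combine": ["ABD", "BCD", "ABC", "ABE"]})
combinaison_df = pd.DataFrame({"zone_id": [0, 10, 20, 30, 40],
                               "combine": ["XYZ", "BCD", "ABD", "ABC", "DEF"]})

# Series indexed by combine value, holding DF3's row labels as its values
# (assumes each combine value appears only once in DF3)
lookup = pd.Series(DF3.index, index=DF3["combine"])

# map() looks each value up in the Series; values not found become NaN
combinaison_df["DF3_index"] = combinaison_df["combine"].map(lookup)
```

Unlike the reset_index version above, this returns DF3's actual index labels rather than positions. The two coincide here because DF3 has the default 0..n-1 index.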