I have 2 dfs
df_1
Nº.do Incidente Status Description Per_Extracao
0 IN6948271 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_SASG_GD9822... DE : 2022/01/05 ATÉ : 2022/12/08
1 IN6948304 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_AACE_R4539 ... DE : 2022/01/05 ATÉ : 2022/12/08
2 IN6948307 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_ADAT_SPRK_EX... DE : 2022/01/05 ATÉ : 2022/12/08
3 IN6948309 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_ADAT_SPRK_EX... DE : 2022/01/05 ATÉ : 2022/12/08
4 IN6948310 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_ADAT_SPRK_EX... DE : 2022/01/05 ATÉ : 2022/12/08
5 IN6948311 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_ADAT_SPRK_EX... DE : 2022/01/05 ATÉ : 2022/12/08
df_2
JOB_NAME JOB_STREAM_NAME
0 PP_AACD_NR_D8706_TIHIBRIDA_PROC_EXCUC_D P26_AACD_FAC_TOD
1 PP_SASG_GD9822 P26_AACE_U08
2 PP_AACE_R4539 P26_AACE_U09
3 PP_AACE_R4539_CONS_JUNC P26_AACE_U08
4 PP_AACE_R4539_FMRC_TD_01 P26_AACE_U08
5 PP_AACE_R4539_FMRC_TD_02 P26_AACE_U08
I'm trying to merge then based on the value of JOB_NAME in df_2 founded in df_1(Description columns), it's a kind of a left join on sql, because I need everything from df_2
the output should be something like this
merged_df
Nº.do Incidente Status Description Per_Extracao JOB_NAME JOB_STREAM_NAME
0 IN6948271 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_SASG_GD9822... DE : 2022/01/05 ATÉ : 2022/12/08 PP_SASG_GD9822 P26_AACE_U08
1 IN6948304 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_AACE_R4539 ... DE : 2022/01/05 ATÉ : 2022/12/08 PP_AACE_R4539 P26_AACE_U09
2 NaN NaN NaN NaN PP_AACD_NR_D8706_TIHIBRIDA_PROC_EXCUC_D P26_AACD_FAC_TOD
3 NaN NaN NaN NaN PP_AACE_R4539_CONS_JUNC P26_AACE_U08
4 NaN NaN NaN NaN PP_AACE_R4539_FMRC_TD_01 P26_AACE_U08
5 NaN NaN NaN NaN PP_AACE_R4539_FMRC_TD_02 P26_AACE_U08
tried to use some merge commands, but, none has worked, for example:
df_test = df_2.merge(df_2,left_on='JOB_NAME',right_on=df_1['Description'].str.extract(df_2['JOB_NAME']),how='left')
and
df_test = df_2.merge(df_2,left_on='JOB_NAME',right_on=df_1['Description'].str.contains(df_2['JOB_NAME']),how='left')
CodePudding user response:
You can use the same answer as in the other question, just add the keyword right
to it on how the merge should be done.
pat = "|".join(df2['JOB_NAME'])
df1.insert(0, 'merge_key', df1['Description'].str.extract("(" pat ")", expand=False))
df = df1.merge(df2, how='right', left_on='merge_key', right_on='JOB_NAME').drop('merge_key', axis=1)
print(df)
Nº.do Incidente Status Description jobName Per_Extracao JOB_NAME JOB_STREAM_NAME
0 NaN NaN NaN NaN NaN PP_AACD_NR_D8706_TIHIBRIDA_PROC_EXCUC_D P26_AACD_FAC_TOD
1 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_SASG_GD9822... NaN DE : 2022/01/05 ATÉ : 2022/12/08 PP_SASG_GD9822 P26_AACE_U08
2 ENCERRADO GR26 D.I.T.I. >>> ABEND NO JOB PP_AACE_R4539 ... NaN DE : 2022/01/05 ATÉ : 2022/12/08 PP_AACE_R4539 P26_AACE_U09
3 NaN NaN NaN NaN NaN PP_AACE_R4539_CONS_JUNC P26_AACE_U08
4 NaN NaN NaN NaN NaN PP_AACE_R4539_FMRC_TD_01 P26_AACE_U08
5 NaN NaN NaN NaN NaN PP_AACE_R4539_FMRC_TD_02 P26_AACE_U08