I have two dataframes, and I need to map different columns in both the dataframe. sharing both dataframe.
Dataframe:-
IN.Type STRA STRB STRC
CRBD 2487 XR XL0054
DFRS 3754 MY XL0684
CRBD 7356 DF XL8911
DFRS 4487 DF XL58999
DFRS 7785 MY XL76568
CRBD 8235 GL XL0635
DFRS 2468 PQ XL4569
DFRS 9735 GR XL7589
CRBD 6486 TY XL5566
DFRS 1023 PQ XL27952
Dataframe2 :-
REF1 REF2
2468 PQ
3754 MY
1023 PQ
9735 NA
2487 XR
7785 MY
7356 DF
8235 NA
6486 TY
4487 NA
What Needed :-
- Firstly need to search Dataframe2["REF2"] in Dataframe["STRB"] and if available then pick the corresponding data available in Dataframe["IN.Type"] and Dataframe["STRC"] for output dataframe.
- Second if Dataframe2["REF2"] == "NA" then need to search Dataframe2["REF1"] in Dataframe["STRA"] and if available then pick the corresponding data available in Dataframe["IN.Type"] and Dataframe["STRC"] for output dataframe.
- Third if Dataframe2["REF2"] not in Dataframe["STRB"] then directly put "NONE" in output dataframe "Type" column.
Output dataframe (Sample) :-
Type RTFN
CRBD XL0054
DFRS XL76568
NONE XL0635
- Data fetched from Dataframe["IN.Type"] will go in Output["Type"].
- Data fetched from Dataframe["STRC"] will go in Output["RFTN"].
CodePudding user response:
Use Series.isin
for filter both DataFrames and set values in DataFrame.loc
:
m1 = Dataframe["STRB"].isin(Dataframe2["REF2"])
m2 = Dataframe["STRA"].isin(Dataframe2.loc[Dataframe2["REF2"].isna(), "REF1"])
#if NA s string
#m2 = Dataframe["STRA"].isin(Dataframe2.loc[Dataframe2["REF2"].eq('NA'), "REF1"])
df = pd.concat([Dataframe.loc[(m1 | m2), ['IN.Type','STRC']].drop_duplicates('IN.Type'),
Dataframe.loc[~m1, ['IN.Type','STRC']].assign(**{'IN.Type':None})])
df.columns=['Type','RTFN']
print (df)
Type RTFN
0 CRBD XL0054
1 DFRS XL0684
5 None XL0635
7 None XL7589