I have two dataframes .
df_patient = pd.DataFrame([['P1'], ['P2'], ['P1'], ['P3']], columns = ['Patient']) df_KPI = pd.DataFrame([['KPI1'], ['KPI2'], ['KPI3'], ['KPI4']], columns = ['KPI'])
Now I need a dataframe df_output which will contain All the indivual KPI's from df_KPI for each of the unique patient from df_patient? My final df should look like this ;
df_output = pd.DataFrame([['P1', 'KPI1'], ['P1', 'KPI2'], ['P1', 'KPI3'] ,['P1', 'KPI4'], ['P2', 'KPI1'], ['P2', 'KPI2'], ['P2', 'KPI3'] ,['P2', 'KPI4'], ['P3', 'KPI1'], ['P3', 'KPI2'], ['P3', 'KPI3'] ,['P3', 'KPI4'] ], columns = ['Patient', 'KPI'])
How can I achive that? TIA.
CodePudding user response:
Use DataFrame.drop_duplicates
with cross join by DataFrame.merge
:
df = df_patient.drop_duplicates('Patient').merge(df_KPI, how='cross')
print (df)
Patient KPI
0 P1 KPI1
1 P1 KPI2
2 P1 KPI3
3 P1 KPI4
4 P2 KPI1
5 P2 KPI2
6 P2 KPI3
7 P2 KPI4
8 P3 KPI1
9 P3 KPI2
10 P3 KPI3
11 P3 KPI4