Home > Enterprise >  Merging two data frames based on a unique col value
Merging two data frames based on a unique col value

Time:09-23

I have two dataframes .

df_patient = pd.DataFrame([['P1'], ['P2'], ['P1'], ['P3']], columns = ['Patient'])

df_KPI = pd.DataFrame([['KPI1'], ['KPI2'], ['KPI3'], ['KPI4']], columns = ['KPI'])

Now I need a dataframe df_output which will contain All the indivual KPI's from df_KPI for each of the unique patient from df_patient? My final df should look like this ;

df_output = pd.DataFrame([['P1', 'KPI1'], ['P1', 'KPI2'], ['P1', 'KPI3'] ,['P1', 'KPI4'],
                          ['P2', 'KPI1'], ['P2', 'KPI2'], ['P2', 'KPI3'] ,['P2', 'KPI4'],
                          ['P3', 'KPI1'], ['P3', 'KPI2'], ['P3', 'KPI3'] ,['P3', 'KPI4']
                         ], columns = ['Patient', 'KPI'])

How can I achive that? TIA.

CodePudding user response:

Use DataFrame.drop_duplicates with cross join by DataFrame.merge:

df = df_patient.drop_duplicates('Patient').merge(df_KPI, how='cross')
print (df)
   Patient   KPI
0       P1  KPI1
1       P1  KPI2
2       P1  KPI3
3       P1  KPI4
4       P2  KPI1
5       P2  KPI2
6       P2  KPI3
7       P2  KPI4
8       P3  KPI1
9       P3  KPI2
10      P3  KPI3
11      P3  KPI4
  • Related