Home > Mobile >  How to add up total for range of columns across rows with associated weights
How to add up total for range of columns across rows with associated weights

Time:05-31

I want to take the dementia_yn & tumour_yn columns and sum them across the row for a patient. But I want to add them up with weights associated for different columns

weights = { "dementia": 2, "tumour_yn": 4 }

patients = [('pat1', 'C77', 'F01', 'M32', 'M315', 1, 1),
         ('pat2', 'I099', 'I278', 'M05', 'F01', 1, 0),
         ('pat3', 'N057', 'N057', 'N058', 'N057', 0, 0)]
labels = ['patient_num', 'DIAGX1', 'DIAGX2', 'DIAGX3', 'DIAGX4', 'dementia_yn', 'tumour_yn']
df_patients = pd.DataFrame.from_records(patients, columns=labels)
df_patients

Input
patient_num DIAGX1  DIAGX2  DIAGX3  DIAGX4  dementia_yn  tumour_yn
pat1        C77     F01     M32     M315    1            1
pat2        I099    I278    M05     F01     1            0
pat3        N057    N057    N058    N057    0            0

Output
Input
patient_num DIAGX1  DIAGX2  DIAGX3  DIAGX4  dementia_yn  tumour_yn  total
pat1        C77     F01     M32     M315    1            1          6
pat2        I099    I278    M05     F01     1            0          2
pat3        N057    N057    N058    N057    0            0          0

CodePudding user response:

Let us try dot

#weights = { "dementia_yn": 2, "tumour_yn": 4 }


df_patients['total'] = df_patients[weights.keys()].dot(list(weights.values()))

CodePudding user response:

Assuming: weights = { "dementia_yn": 2, "tumour_yn": 4 }

Using classical multiplication and sum.

Manuel alignment:

df_patients['total'] = df_patients[list (weights)].mul(weights).sum(1)

print(df_patients)

Automating alignment (works even is there are extra keys in the dictionary):

df_patient['total'] = df_patients.mul(pd.Series(weights)).sum(1)

Output:

  patient_num DIAGX1 DIAGX2 DIAGX3 DIAGX4  dementia_yn  tumour_yn  total
0        pat1    C77    F01    M32   M315            1          1      6
1        pat2   I099   I278    M05    F01            1          0      2
2        pat3   N057   N057   N058   N057            0          0      0

If really you have the dictionary keys without "_yn", using a Series allows you to dynamically add the suffix:

weights = { "dementia": 2, "tumour": 4 }

df_patients['total'] = df_patients.mul(pd.Series(weights).add_suffix('_yn')).sum(1)
  • Related