How to add column(s) for each classification that contains values in a list

I have 12 classifications, each containing multiple codes (only two are shown in this example: dementia and solid tumour).

condition: codes
dementia: F01, F02, F03, F051, G30, G311
solid tumour: C77, C78, C79, C80

I want to add a column for each of these 12 conditions that checks whether a patient has any code for that condition: 1 if yes, 0 if no.

import pandas as pd

patients = [('pat1', 'C77', 'F01', 'M32', 'M315'),
            ('pat2', 'I099', 'I278', 'M05', 'F01'),
            ('pat3', 'N057', 'N057', 'N058', 'N057')]
labels = ['patient_num', 'DIAGX1', 'DIAGX2', 'DIAGX3', 'DIAGX4']
df_patients = pd.DataFrame.from_records(patients, columns=labels)
df_patients

Input
patient_num DIAGX1  DIAGX2  DIAGX3  DIAGX4
pat1        C77     F01     M32     M315
pat2        I099    I278    M05     F01
pat3        N057    N057    N058    N057

Desired output
patient_num DIAGX1  DIAGX2  DIAGX3  DIAGX4  dementia_yn  tumour_yn
pat1        C77     F01     M32     M315    1            1
pat2        I099    I278    M05     F01     1            0
pat3        N057    N057    N058    N057    0            0

I have previously used np.select(conditions, values) to create a single column based on conditions, but would appreciate help creating multiple columns dependent on conditions.

CodePudding user response:

You can store the conditions and their codes in a dictionary, loop over it, and use isin(codes).any(axis=1) to check whether any code for each condition appears in each row of the dataframe:

all_codes = {
    'dementia': ['F01', 'F02', 'F03', 'F051', 'G30', 'G311'],
    'solid_tumour': ['C77', 'C78', 'C79', 'C80'],
}

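# df is the DataFrame from the question (df_patients)
for condition, codes in all_codes.items():
    # 1 if any cell in the row matches one of the condition's codes, else 0
    df[condition + '_yn'] = df.isin(codes).any(axis=1).astype(int)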

Output:

>>> df
  patient_num DIAGX1 DIAGX2 DIAGX3 DIAGX4  dementia_yn  solid_tumour_yn
0        pat1    C77    F01    M32   M315            1                1
1        pat2   I099   I278    M05    F01            1                0
2        pat3   N057   N057   N058   N057            0                0
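
Note that the loop above checks every column, including patient_num and any flag columns added on earlier iterations. That is harmless for this data, but the following is a minimal self-contained sketch that restricts the check to the diagnosis columns and builds all flag columns in one step. It assumes that every diagnosis column name starts with 'DIAGX' (true for the sample data, but an assumption about the full dataset); the names diag_cols and flags are made up for this illustration.

```python
import pandas as pd

patients = [('pat1', 'C77', 'F01', 'M32', 'M315'),
            ('pat2', 'I099', 'I278', 'M05', 'F01'),
            ('pat3', 'N057', 'N057', 'N058', 'N057')]
labels = ['patient_num', 'DIAGX1', 'DIAGX2', 'DIAGX3', 'DIAGX4']
df = pd.DataFrame.from_records(patients, columns=labels)

all_codes = {
    'dementia': ['F01', 'F02', 'F03', 'F051', 'G30', 'G311'],
    'solid_tumour': ['C77', 'C78', 'C79', 'C80'],
}

# Assumption: every diagnosis column name starts with 'DIAGX'.
diag_cols = [c for c in df.columns if c.startswith('DIAGX')]

# One 0/1 column per condition, computed only from the diagnosis columns.
flags = pd.DataFrame(
    {f'{condition}_yn': df[diag_cols].isin(codes).any(axis=1).astype(int)
     for condition, codes in all_codes.items()},
    index=df.index,
)

# Attach all flag columns at once instead of assigning them one by one.
df = pd.concat([df, flags], axis=1)
print(df)
```

Like the loop, this matches codes exactly: a row is flagged only if a cell equals one of the listed codes, so 'F01' is matched but a longer code such as 'F011' would not be.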