Checking list elements in Pandas column with other corresponding column: Pandas

Time:02-21

I have the following data frame.

df

ID        List_values1                           List_values2                     A_value   B_value   C_code   
1    [[('A_code', 2), ('B_code', 2)]]            (C_code, 4)                          1         0         0         
2    (B_code, 3)                      [[('A_code', 2), ('B_code', 2), ('C_code', 4)]] 0         1         1         

I would like to check the elements in List_values1 and List_values2 against the columns A_value, B_value, and C_code, matching them by name (for example, A_code goes with A_value). If the matching column is 0 in that row, the element should be removed. For example, if List_values1 contains an A_code element and A_value == 0 in the same row, I would like to remove that element from the list. The same applies to the other codes.

My expected output is something like this:

ID        List_values1                           List_values2                  A_value   B_value   C_code   
1        ('A_code', 2)                                                             1         0         0         
2         (B_code, 3)               [[('B_code', 2),('C_code', 4)]]                0         1         1         

Can anyone help with this?

Answer:

Use a custom function that tests the tuples (single tuples and tuples inside lists) and keeps only those whose name matches a column containing 1:

print (df)
   ID                List_values1                             List_values2  \
0   1  [(A_code, 2), (B_code, 2)]                              (C_code, 4)   
1   2                 (B_code, 3)  [(A_code, 2), (B_code, 2), (C_code, 4)]   

   A_value  B_value  C_value  
0        1        0        0  
1        0        1        1  

v = ['A_value','B_value','C_value']

L = ['List_values1','List_values2']

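def f(x):
    # codes whose flag column is truthy in this row, e.g. ['A_code']
    need = x[v].index[x[v].astype(bool)].str.replace('value','code').tolist()

    for c, val in x[L].items():
        if isinstance(val, tuple):
            # single tuple: keep it only if its code is needed, else blank the cell
            x[c] = val if val[0] in need else ''
        elif isinstance(val, list):
            # list of tuples: keep the needed ones, unwrap a single survivor
            out = [z for z in val if z[0] in need]
            x[c] = out[0] if len(out) == 1 else out
    return x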

df = df.apply(f, axis=1)
print (df)
   ID List_values1                List_values2  A_value  B_value  C_value
0   1  (A_code, 2)                                    1        0        0
1   2  (B_code, 3)  [(B_code, 2), (C_code, 4)]        0        1        1
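
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame construction is only my reconstruction of the printed sample above, and `filter_cell`, `flag_cols`, `list_cols` and `result` are hypothetical names, not part of the original answer. It applies the same rule but builds new columns instead of modifying the row inside `apply`:

```python
import pandas as pd

# Reconstruction of the sample frame printed above (an assumption).
df = pd.DataFrame({
    "ID": [1, 2],
    "List_values1": [[("A_code", 2), ("B_code", 2)], ("B_code", 3)],
    "List_values2": [("C_code", 4), [("A_code", 2), ("B_code", 2), ("C_code", 4)]],
    "A_value": [1, 0],
    "B_value": [0, 1],
    "C_value": [0, 1],
})

flag_cols = ["A_value", "B_value", "C_value"]
list_cols = ["List_values1", "List_values2"]

def filter_cell(val, need):
    """Keep only the tuples whose first item is in `need`."""
    if isinstance(val, tuple):
        return val if val[0] in need else ""
    if isinstance(val, list):
        out = [z for z in val if z[0] in need]
        # same convention as f above: unwrap a single surviving tuple
        return out[0] if len(out) == 1 else out
    return val  # leave any other cell type untouched

# For each row, the set of code names whose flag column is truthy.
code_names = [c.replace("value", "code") for c in flag_cols]
flags = df[flag_cols].astype(bool).to_numpy()
needs = [{name for name, on in zip(code_names, row) if on} for row in flags]

result = df.copy()
for col in list_cols:
    result[col] = [filter_cell(val, need) for val, need in zip(df[col], needs)]

print(result)
```

Keeping the per-cell logic in a small pure function makes it easy to test on its own, and the original `df` is left unchanged.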

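EDIT: To match only on the prefix before the underscore (so A_value pairs with A_code whatever the suffix is):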

v = ['A_value','B_value','C_value']

L = ['List_values1','List_values2']

def f(x):
    # prefixes of the flag columns that are truthy in this row, e.g. ['A']
    need = x[v].index[x[v].astype(bool)].str.split('_').str[0].tolist()
    # print (need)

    for c, val in x[L].items():
        if isinstance(val, tuple):
            # compare only the part before the underscore
            x[c] = val if val[0].split('_')[0] in need else ''
        elif isinstance(val, list):
            out = [z for z in val if z[0].split('_')[0] in need]
            x[c] = out[0] if len(out) == 1 else out
    return x

df = df.apply(f, axis=1)
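
The EDIT version compares only the text before the first underscore, so a flag column does not have to end in `_value`. A small pandas-free sketch of that matching rule follows. The helper names `active_prefixes` and `keep` are hypothetical, and a plain dict stands in for one row; it uses the column names from the question, where the third flag column is called C_code:

```python
def active_prefixes(row, flag_cols):
    """Prefixes (text before the first underscore) of the truthy flag columns."""
    return {c.split("_")[0] for c in flag_cols if row[c]}

def keep(item, prefixes):
    """True if the tuple's name shares a prefix with an active flag column."""
    return item[0].split("_")[0] in prefixes

# One row from the question, written as a dict for illustration only.
row = {"A_value": 0, "B_value": 1, "C_code": 1}
prefixes = active_prefixes(row, ["A_value", "B_value", "C_code"])

items = [("A_code", 2), ("B_code", 2), ("C_code", 4)]
kept = [it for it in items if keep(it, prefixes)]
print(kept)
```

With this rule v would be ['A_value','B_value','C_code'] for the frame in the question.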