I have the following data frame, df:
ID List_values1 List_values2 A_value B_value C_code
1 [[('A_code', 2), ('B_code', 2)]] (C_code, 4) 1 0 0
2 (B_code, 3) [[('A_code', 2), ('B_code', 2), ('C_code', 4)]] 0 1 1
I would like to check the elements of the lists in List_values1 and List_values2 against the columns A_value, B_value, and C_code, matching each element to a column by name (for example, A_code goes with A_value). If, say, a list in List_values1 contains A_code and the A_value column of that row is 0, I would like to remove that element from the list. The same applies to the other codes and columns.
My expected output is something like the following.
ID List_values1 List_values2 A_value B_value C_code
1 ('A_code', 2) 1 0 0
2 (B_code, 3) [[('B_code', 2),('C_code', 4)]] 0 1 1
Can anyone help with this?
CodePudding user response:
Use a custom function that tests each tuple, whether it stands alone or sits inside a list, and keeps it only if the matching column contains 1:
print (df)
ID List_values1 List_values2 \
0 1 [(A_code, 2), (B_code, 2)] (C_code, 4)
1 2 (B_code, 3) [(A_code, 2), (B_code, 2), (C_code, 4)]
A_value B_value C_value
0 1 0 0
1 0 1 1
v = ['A_value','B_value','C_value']
L = ['List_values1','List_values2']
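def f(x):
    # Names whose flag column is truthy, e.g. 'A_value' -> 'A_code'
    need = x[v].index[x[v].astype(bool)].str.replace('value','code').tolist()
    # print (need)
    for c, val in x[L].items():
        # print (val)
        # A single tuple: keep it only if its name is needed
        if isinstance(val, tuple):
            x[c] = val if val[0] in need else ''
        # A list of tuples: keep the matching ones, unwrap a lone survivor
        if isinstance(val, list):
            out = [z for z in val if z[0] in need]
            x[c] = out[0] if len(out) == 1 else out
    return x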
df = df.apply(f, axis=1)
print (df)
ID List_values1 List_values2 A_value B_value C_value
0 1 (A_code, 2) 1 0 0
1 2 (B_code, 3) [(B_code, 2), (C_code, 4)] 0 1 1
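The core of f is the per-cell filter: a tuple is kept only when its first element is in need. A minimal pandas-free sketch of that step is shown below. The helper name keep_active and the plain dict standing in for one row are assumptions made for illustration, not part of the answer's code.

```python
def keep_active(cell, need):
    """Filter one cell: a single (name, number) tuple or a list of them.

    keep_active is a hypothetical helper, not part of pandas.
    """
    if isinstance(cell, tuple):
        # A lone tuple survives only if its name is allowed.
        return cell if cell[0] in need else ''
    if isinstance(cell, list):
        kept = [pair for pair in cell if pair[0] in need]
        # Mirror the answer: unwrap a single survivor, otherwise keep the list.
        return kept[0] if len(kept) == 1 else kept
    return cell


# First row of the sample data as a plain dict (assumed layout).
row = {
    'List_values1': [('A_code', 2), ('B_code', 2)],
    'List_values2': ('C_code', 4),
    'A_value': 1, 'B_value': 0, 'C_value': 0,
}
flags = ['A_value', 'B_value', 'C_value']

# Translate every truthy flag column into the code name it controls.
need = [name.replace('value', 'code') for name in flags if row[name]]

result = {col: keep_active(row[col], need)
          for col in ('List_values1', 'List_values2')}
print(result)
```

Note that, as in the answer's function, a list with no surviving tuples comes back as an empty list rather than ''.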
EDIT:
v = ['A_value','B_value','C_value']
L = ['List_values1','List_values2']
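def f(x):
    # Prefixes of the truthy flag columns, e.g. 'A_value' -> 'A'
    need = x[v].index[x[v].astype(bool)].str.split('_').str[0].tolist()
    # print (need)
    for c, val in x[L].items():
        # print (val)
        # A single tuple: compare only the part before the first underscore
        if isinstance(val, tuple):
            x[c] = val if val[0].split('_')[0] in need else ''
        # A list of tuples: keep the matching ones, unwrap a lone survivor
        if isinstance(val, list):
            out = [z for z in val if z[0].split('_')[0] in need]
            x[c] = out[0] if len(out) == 1 else out
    return x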
df = df.apply(f, axis=1)
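The EDIT version differs only in how names are matched: instead of rewriting value to code, it compares the part before the first underscore on both sides. The sketch below contrasts the two rules. The function names and the column name C_flag are made up for illustration, chosen to show a suffix that the first rule would miss.

```python
def need_by_replace(active_columns):
    # First version: 'A_value' -> 'A_code'
    return [c.replace('value', 'code') for c in active_columns]


def need_by_prefix(active_columns):
    # EDIT version: 'A_value' -> 'A'
    return [c.split('_')[0] for c in active_columns]


active = ['B_value', 'C_flag']  # 'C_flag' is a hypothetical column name
pair = ('C_code', 4)

# Exact-name rule: 'C_flag' is left unchanged, so 'C_code' is not found.
matched_by_replace = pair[0] in need_by_replace(active)

# Prefix rule: both sides reduce to 'C', so the tuple is kept.
matched_by_prefix = pair[0].split('_')[0] in need_by_prefix(active)

print(matched_by_replace, matched_by_prefix)
```

Prefix matching is more forgiving, but it assumes that the prefixes are unique: any two names that share the text before the first underscore will be treated as a match.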