I have the following dictionary and dataframe:
import pandas as pd

val_dict = {
    'key1': ['val1', 'val2', 'val3'],
    'key2': ['val4', 'val5']
}
df = pd.DataFrame(data={'val5': [True, False, False],
                        'val2': [False, True, False],
                        'val3': [True, True, False],
                        'val1': [True, False, True],
                        'val4': [True, True, False],
                        'val6': [False, False, True]},
                  index=pd.Series([1, 2, 3], name='index'))
index | val5 | val2 | val3 | val1 | val4 | val6 |
---|---|---|---|---|---|---|
1 | True | False | True | True | True | False |
2 | False | True | True | False | True | False |
3 | False | False | False | True | False | True |
How do I go through the dataframe from the left so that, if a column is True, the other columns listed under the same val_dict key turn to False? The expected result is:
index | val5 | val2 | val3 | val1 | val4 | val6 |
---|---|---|---|---|---|---|
1 | True | False | True | FALSE | FALSE | False |
2 | False | True | FALSE | False | True | False |
3 | False | False | False | True | False | True |
For example, index 1 has val5 as True, so val4 switches to False because they are both assigned to the same val_dict key. Similarly, val2 is False but val3 is True, so val1 gets turned to False. Note that it should skip over val6.
I tried converting df to a dictionary with df.to_dict('index') to work with two dictionaries. However, dictionaries are unordered and the order of the columns is important, so I thought it might make the code buggy.
CodePudding user response:
One way is with a combination of assign and mask:
# rows where val2 or val3 is True:
com = df.filter(['val2', 'val3']).sum(axis=1).ge(1)
# val2 is the leftmost, so start with that
(df.assign(**df.filter(['val1', 'val3']).mask(df.val2, False))
# next is the combination of val2 and val3
   .assign(val1 = lambda df: df.val1.mask(com, False),
           val4 = lambda df: df.val4.mask(df.val5, False))
)
Out[84]:
        val5   val2   val3   val1   val4   val6
index
1       True  False   True  False  False  False
2      False   True  False  False   True  False
3      False  False  False   True  False   True
Note that val6 is untouched, so the values remain the same.
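The column names above are hard-coded, so the chain would need rewriting if val_dict changed. A more general sketch of the same "first True from the left wins" rule is shown here. The helper name keep_first_true and its groups parameter are made up for illustration and are not part of the answer above. It assumes every column holds plain booleans and that each column belongs to at most one val_dict key.

```python
import pandas as pd


def keep_first_true(df, groups):
    """Hypothetical helper: within each group of columns, keep only the
    leftmost True in every row and set the other group members to False."""
    out = df.copy()
    for members in groups.values():
        # Columns of this group, in the DataFrame's left-to-right order.
        cols = [c for c in df.columns if c in members]
        if not cols:
            continue
        sub = df[cols]
        # Running count of True values per row, including the current column.
        seen = sub.astype(int).cumsum(axis=1)
        # A cell stays True only if it is the first True of its row in the group.
        out[cols] = sub & seen.eq(1)
    return out


val_dict = {
    'key1': ['val1', 'val2', 'val3'],
    'key2': ['val4', 'val5'],
}
df = pd.DataFrame(data={'val5': [True, False, False],
                        'val2': [False, True, False],
                        'val3': [True, True, False],
                        'val1': [True, False, True],
                        'val4': [True, True, False],
                        'val6': [False, False, True]},
                  index=pd.Series([1, 2, 3], name='index'))

result = keep_first_true(df, val_dict)
print(result)
```

Columns that appear in no group, such as val6, are never selected, so they are copied through unchanged.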
CodePudding user response:
Here's what I came up with by converting to a dictionary:
def section_filter(df, section_dict):
    # all columns that belong to some group in section_dict
    grouped = [m for mi in section_dict.values() for m in mi]
    result = {}
    for index, vals in df.to_dict('index').items():
        lst = []
        # one dict per group, keeping the columns in dataframe order
        for val in section_dict.values():
            lst.append({k: v for k, v in vals.items() if k in val})
        # ungrouped columns (e.g. val6) each get their own single-item dict
        for k, v in vals.items():
            if k not in grouped:
                lst.append({k: v})
        # within each dict, keep only the first True
        for d in lst:
            for i in d:
                if d[i]:
                    d.update({k: False for k in d})
                    d[i] = True
                    break
        result[index] = {k: v for d in lst for k, v in d.items()}
    return pd.DataFrame.from_dict(result, orient='index', columns=df.columns)
print(df)
print()
print(section_filter(df, val_dict))
        val5   val2   val3   val1   val4   val6
index
1       True  False   True   True   True  False
2      False   True   True  False   True  False
3      False  False  False   True  False   True

    val5   val2   val3   val1   val4   val6
1   True  False   True  False  False  False
2  False   True  False  False   True  False
3  False  False  False   True  False   True
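On the ordering worry from the question: since Python 3.7, plain dicts are guaranteed to keep insertion order, and df.to_dict('index') appears to build each row's dict in column order, so the loop in section_filter should see the columns left to right. A quick check on a small frame (the shortened df here is only for illustration):

```python
import pandas as pd

# Small frame with a deliberately non-alphabetical column order.
df = pd.DataFrame({'val5': [True, False],
                   'val2': [False, True],
                   'val3': [True, True]},
                  index=pd.Series([1, 2], name='index'))

rows = df.to_dict('index')
# Keys of the first row's dict, in the order the dict stores them.
first_row_keys = list(rows[1].keys())
print(first_row_keys)
print(first_row_keys == list(df.columns))
```

If the second print shows True, the row dictionaries follow the DataFrame's column order on your pandas version.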