I am working with pandas DataFrames of this kind:
0 1 2 3
1 {1} p→q {} A
2 {2, 3, 4} ¬q {} A
3 {2, 6} ¬q {} A
4 {3} ¬ (¬p) {} A
5 {4} ¬p {} A
My goal is a general selection statement that finds the first row satisfying both of the following conditions:
- Column 1 is equal to some specific value, e.g.
¬q
- Column 0 is a subset of some specific value, e.g.
{5,6,2}
My code so far:
import pandas as pd
example = {1: [{1}, 'p→q', set(), 'A'], 2: [{2,3,4}, '¬q', set(), 'A'], 3: [{2,6}, '¬q', set(), 'A'], 4: [{3}, '¬ (¬p)', set(), 'A'], 5: [{4}, '¬p', set(), 'A']}
origin = {5,6,2}
select = []
example = pd.DataFrame(example).transpose()
print(example)
select.append(example.loc[(example[1] == '¬q' &
example[0].issubset(origin)
)])
print(select)
Error message raised by the select.append statement:
AttributeError: 'Series' object has no attribute 'issubset'
I would appreciate it if you could explain how to solve this problem and why it doesn't work with loc, similar to that tutorial.
CodePudding user response:
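A Series has no issubset method; that method belongs to the individual set objects stored in the column. Call it on each element with a lambda function in Series.apply. Also add () around the first condition, because & binds more tightly than ==, so without them Python tries to evaluate '¬q' & ... first: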
select.append(example.loc[(example[1] == '¬q') &
example[0].apply(lambda x: x.issubset(origin))
])
print (select)
[ 0 1 2 3
3 {2, 6} ¬q {} A]
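The selection above returns every matching row, while the question asks for the first one. A minimal sketch of one way to get it, assuming "first" means first in index order; the names rows, mask, matches, first_label and first_row are only illustrative and not part of the original code:

```python
import pandas as pd

# Same data as in the question, rebuilt so the snippet runs on its own.
rows = {
    1: [{1}, 'p→q', set(), 'A'],
    2: [{2, 3, 4}, '¬q', set(), 'A'],
    3: [{2, 6}, '¬q', set(), 'A'],
    4: [{3}, '¬ (¬p)', set(), 'A'],
    5: [{4}, '¬p', set(), 'A'],
}
example = pd.DataFrame(rows).transpose()
origin = {5, 6, 2}

# Build the boolean mask once: both conditions are parenthesised or
# already Series, so & combines them element-wise.
mask = (example[1] == '¬q') & example[0].apply(lambda s: s.issubset(origin))

# All matching rows, in index order.
matches = example.loc[mask]

# Take the first match, guarding against the case where nothing matches.
first_label = matches.index[0] if not matches.empty else None
first_row = example.loc[first_label] if first_label is not None else None

print(first_label)  # 3
```

Checking matches.empty first avoids an IndexError when no row satisfies both conditions; in that case first_row is simply None.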