Python: Select values on pandas dataframe operating on set filters?

I am working with pandas DataFrames of this kind:

           0       1   2  3
1        {1}     p→q  {}  A
2  {2, 3, 4}      ¬q  {}  A
3     {2, 6}      ¬q  {}  A
4        {3}  ¬ (¬p)  {}  A
5        {4}      ¬p  {}  A

My aim is a general select statement that finds the first row fulfilling both of the following conditions:

Column 1 equals a specific value, e.g. ¬q
Column 0 is a subset of a specific set, e.g. {5,6,2}

My code so far:

import pandas as pd

example = {1: [{1}, 'p→q', set(), 'A'], 2: [{2,3,4}, '¬q', set(), 'A'], 3: [{2,6}, '¬q', set(), 'A'], 4: [{3}, '¬ (¬p)', set(), 'A'], 5: [{4}, '¬p', set(), 'A']}

origin = {5,6,2}

select = []

example = pd.DataFrame(example).transpose()

print(example)


select.append(example.loc[(example[1] == '¬q' &
                           example[0].issubset(origin)
                        )])

print(select)

Error message raised by the select.append(...) statement:

AttributeError: 'Series' object has no attribute 'issubset'

I would appreciate it if you could explain how to solve the problem, and why it doesn't work with loc, similar to that tutorial.

CodePudding user response:

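issubset is a method of each set stored in the column, not of the Series itself, which is why example[0].issubset(origin) raises the AttributeError. Apply it element-wise with a lambda function in Series.apply, and add () around the first condition, because & binds more tightly than ==: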

select.append(example.loc[(example[1] == '¬q') &
                          example[0].apply(lambda x: x.issubset(origin))
                         ])
print(select)
[        0   1   2  3
3  {2, 6}  ¬q  {}  A]
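
Since the question asks for the first matching row only, here is a minimal self-contained sketch that builds the two conditions as separate boolean masks and then takes the first hit. The names formula_matches, is_subset, matches and first_match are made up for illustration, and returning None when nothing matches is just one possible convention.

```python
import pandas as pd

# Same data as in the question: one list per row, transposed into rows.
example = {
    1: [{1}, '¬p→q'.replace('¬p', 'p'), set(), 'A'],
    2: [{2, 3, 4}, '¬q', set(), 'A'],
    3: [{2, 6}, '¬q', set(), 'A'],
    4: [{3}, '¬ (¬p)', set(), 'A'],
    5: [{4}, '¬p', set(), 'A'],
}
example = pd.DataFrame(example).transpose()
origin = {5, 6, 2}

# Condition 1: plain element-wise comparison, yields a boolean Series.
formula_matches = example[1] == '¬q'

# Condition 2: issubset lives on each set, so call it once per cell.
is_subset = example[0].apply(lambda s: s.issubset(origin))

# Combine the two boolean Series; naming them first avoids the
# precedence trap of writing `a == b & c` without parentheses.
matches = example.loc[formula_matches & is_subset]

# First matching row as a Series, or None if no row qualifies.
first_match = matches.iloc[0] if not matches.empty else None
print(first_match)
```

Keeping the masks in named variables also makes it easy to inspect each condition on its own when a filter unexpectedly returns no rows.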