Python: count occurrences in a comma-delimited column when another column in the same row equals x

I have a data set with two columns. For rows where columnA = 1, I want to count the occurrences of x in columnB, which holds comma-delimited values.

Sample data set, stored in an Excel file:

columnA    columnB
1          x,a,b,c
2          d,e,g
3          a,r,x
4          y,x,o,a

What I've tried

if any ('1' in str(x) for x in excel_file['columnA']):
    count = excel_file['columnB'].astype(str).str.contains('x').value_counts()[True]
else:
    count = 0

This does get me the number of occurrences of x, but it counts all of them, not only those in rows where columnA equals 1.

I know that in Excel it could be written as =COUNTIFS(columnA, "1", columnB, "*x*"), but I can't seem to find a similar way of doing this in Python.

Any help would be greatly appreciated!

CodePudding user response：

While waiting for more complete sample data and the expected outcome, here is a preliminary answer:

Assuming this is how your data look:

import pandas as pd
data = [[1, ['x', 'a', 'b', 'c']], [2, ['d', 'e', 'g']], [3, ['a', 'r', 'x']], [4, ['y', 'x', 'o', 'a']]]
excel_file = pd.DataFrame(data, columns=['columnA', 'columnB'])

Usually, I'd do your task with a for loop by row:

for idx, row in excel_file.iterrows():
    if row['columnA'] == 1:
        count = 0
        for item in row['columnB']:
            if item == 'x':
                count += 1
        print('Row', idx, 'has', count, 'occurrence(s) of x')
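If columnB actually holds comma-delimited strings, as in the sample in the question, rather than Python lists, a rough vectorized sketch might look like the following. The DataFrame here is a made-up stand-in for the Excel file, and the names matching_rows, x_per_row and total_x are only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet: columnB holds comma-delimited strings.
excel_file = pd.DataFrame({
    'columnA': [1, 2, 3, 4],
    'columnB': ['x,a,b,c', 'd,e,g', 'a,r,x', 'y,x,o,a'],
})

# Keep only the rows where columnA equals 1.
matching_rows = excel_file[excel_file['columnA'] == 1]

# Split each string on commas and count the tokens that are exactly 'x'.
x_per_row = matching_rows['columnB'].str.split(',').apply(lambda items: items.count('x'))

# Total number of 'x' tokens across the matching rows.
total_x = int(x_per_row.sum())
print(total_x)
```

Splitting on the comma counts exact items, so a value such as 'xy' would not be counted as an x.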

CodePudding user response：

Looks like this was actually able to be solved pretty easily using len.

# Rows whose columnA, read as text, contains '1'
filter = excel_file[excel_file['columnA'].astype(str).str.contains('1')]
# Number of those rows whose columnB contains 'x'
filter_2 = len(filter[filter['columnB'].astype(str).str.contains('x')])

I had tried something similar earlier, but I think splitting it into two separate steps rather than one expression is what made it work.
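For anyone who wants this closer to the COUNTIFS form, here is a hedged sketch that combines both conditions into one boolean mask. It assumes columnA is numeric and uses a made-up DataFrame. The extra row with columnA = 10 is my own addition, there to show that a substring test for '1' on columnA would also match 10, while an exact comparison would not.

```python
import pandas as pd

# Hypothetical data; the row with columnA == 10 is added to show the difference.
excel_file = pd.DataFrame({
    'columnA': [1, 2, 3, 4, 10],
    'columnB': ['x,a,b,c', 'd,e,g', 'a,r,x', 'y,x,o,a', 'x,z'],
})

# Exact comparison on columnA, plain substring test on columnB.
is_one = excel_file['columnA'] == 1
has_x = excel_file['columnB'].astype(str).str.contains('x', regex=False)

# Summing a boolean mask counts its True values, i.e. rows meeting both conditions.
count_exact = int((is_one & has_x).sum())

# For comparison: a substring test on columnA also matches 10.
is_one_substring = excel_file['columnA'].astype(str).str.contains('1', regex=False)
count_substring = int((is_one_substring & has_x).sum())

print(count_exact, count_substring)
```

Like the two-step filter above, this counts rows that contain x, not the number of x items inside each row.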

Thanks Yee for trying to provide an answer!