Home > database >  How to count the number of occurrences on comma delimited column in Python Pandas
How to count the number of occurrences on comma delimited column in Python Pandas

Time:12-04

How can to count the number of occurrences of comma-separated values from the whole list of columns

data frame is like this:

id column
1   
2   1
3   1
4   1,2
5   1,2
6   1,2,4
7   1,2,4
8   1,2,4,6
9   1,2,4,6
10  1,2,4,6,8
11  1,2,4,6,8

Desired output is:

id column count
1.         10
2   1.     7    
3   1.     0
4   1,2.   6
5   1,2.   0
6   1,2,4. 4  
7   1,2,4.  0
8   1,2,4,6. 2
9   1,2,4,6. 0
10  1,2,4,6,8 0
11  1,2,4,6,8 0

Tried this:

df = pd.read_csv('parentsplit/parentlist.csv')
df['count'] = df['parent_list'].str.split(',', expand=True).stack().value_counts()


its not working.

CodePudding user response:

You can do as follows:

df['count'] = df['id'].apply(lambda x: df['column'].fillna('X').str.contains(str(x)).sum())

This is basically counting the number of occurence of each id in the column.

Output:

    id     column  count
0    1       None   10
1    2          1    8
2    3          1    0
3    4        1,2    6
4    5        1,2    0
5    6      1,2,4    4
6    7      1,2,4    0
7    8    1,2,4,6    2
8    9    1,2,4,6    0
9   10  1,2,4,6,8    0
10  11  1,2,4,6,8    0

CodePudding user response:

Split and explode the column, then count the occurrences using value_counts then map the counts onto id column

s = df['column'].str.split(',').explode().value_counts()
df['count'] = df['id'].astype(str).map(s).fillna(0)

    id     column  count
0    1       None   10.0
1    2          1    8.0
2    3          1    0.0
3    4        1,2    6.0
4    5        1,2    0.0
5    6      1,2,4    4.0
6    7      1,2,4    0.0
7    8    1,2,4,6    2.0
8    9    1,2,4,6    0.0
9   10  1,2,4,6,8    0.0
10  11  1,2,4,6,8    0.0
  • Related