How to count the number of unique strings in a column per row in pandas

Time:10-21

I have this table

no    type
1     123, 234, 345
2     123
3     4567,235
4     

I want to count the number of strings in the type column for each row and add a new column that shows the result. The expected result would be:

no    type                 count
1     123, 234, 345          3
2     123                    1
3     4567,235               2
4     NaN                    0

How can I get the expected result using pandas?

Thanks in advance.

CodePudding user response:

Try str.count:

df['count'] = df['type'].str.count(',').add(1).fillna(0)

Output:

   no           type  count
0   1  123, 234, 345    3.0
1   2            123    1.0
2   3       4567,235    2.0
3   4           None    0.0
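The count column above is float because the missing row goes through NaN before being filled. A minimal, self-contained sketch of the same approach with an integer result follows; the DataFrame is reconstructed from the question, with None assumed for the empty cell.

```python
import pandas as pd

# Sample data reconstructed from the question; None stands in for the empty cell.
df = pd.DataFrame({
    "no": [1, 2, 3, 4],
    "type": ["123, 234, 345", "123", "4567,235", None],
})

# Number of commas + 1 is the number of items; missing rows become 0.
# astype(int) drops the float dtype that the intermediate NaN introduces.
df["count"] = df["type"].str.count(",").add(1).fillna(0).astype(int)

print(df)
```

Note that this counts separators rather than items, so an empty string would count as 1 and a trailing comma would add an extra item.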

CodePudding user response:

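Try splitting on the comma and taking the length of each resulting list. Missing rows come back as NaN, so fill them with 0 to match the expected result:

df['new'] = df['type'].str.split(',').str.len().fillna(0).astype(int)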
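The title asks about unique strings, while the example counts every item. If distinct values per row are what is wanted, a sketch along these lines might do; count_unique, n_items and n_unique are hypothetical names, and stripping whitespace around each item is an assumption about the data.

```python
import pandas as pd

# Sample data reconstructed from the question; None stands in for the empty cell.
df = pd.DataFrame({
    "no": [1, 2, 3, 4],
    "type": ["123, 234, 345", "123", "4567,235", None],
})


def count_unique(cell):
    """Count distinct comma-separated items in one cell; missing or blank cells give 0."""
    if not isinstance(cell, str):
        return 0
    # Strip surrounding spaces and ignore empty pieces (assumed to be noise).
    items = {part.strip() for part in cell.split(",") if part.strip()}
    return len(items)


# Every item, duplicates included.
df["n_items"] = df["type"].str.split(",").str.len().fillna(0).astype(int)
# Distinct items only.
df["n_unique"] = df["type"].map(count_unique)

print(df)
```

On the sample table both columns agree, since no row repeats a value; they differ only when a row contains duplicates, for example count_unique("a, b,a") gives 2.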

CodePudding user response:

A list comprehension might be faster, depending on the data size:

df.assign(count=[len(ent.split(',')) if ent else 0
                 for ent in df['type'].fillna('')])

   no           type  count
0   1  123, 234, 345      3
1   2            123      1
2   3       4567,235      2
3   4           None      0

Another option, in terms of speed, could be to convert the column to pandas' PyArrow-backed string dtype (string[pyarrow]) before applying pandas' string methods.
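A rough sketch of that conversion, assuming the optional pyarrow package is installed; the fallback to the plain string dtype is an addition for illustration only. Whether this is actually faster depends on the data and the pandas version, so it is worth timing on real data.

```python
import pandas as pd

# Sample data reconstructed from the question; None stands in for the empty cell.
df = pd.DataFrame({
    "no": [1, 2, 3, 4],
    "type": ["123, 234, 345", "123", "4567,235", None],
})

try:
    # PyArrow-backed strings; needs the optional pyarrow dependency.
    types = df["type"].astype("string[pyarrow]")
except ImportError:
    # Fallback (assumption): pandas' default string storage when pyarrow is absent.
    types = df["type"].astype("string")

# Same counting logic as before, applied to the converted column.
df["count"] = types.str.count(",").add(1).fillna(0).astype(int)

print(df)
```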
