Home > Enterprise >  How to count on Check all that apply options?
How to count on Check all that apply options?

Time:05-26

I have a survey which contains several Check all that apply questions. I am using Tableau to do visualization but I failed to split the data into different columns due to the limitation of Tableau Desktop(Seems you need to use Tableau Prep). So I have been thinking about using python to get this done, the table is like this:

Q: Which would you like to know more about? (select all that apply)

id    answer   
001   A; B; C
002   A
003   B; C
004   A; C

So, I'd like to get the result like this:

Answer   Count
 A        3
 B        2
 C        3

Can anyone help?

CodePudding user response:

You can try split the semicolon column to list then explode at last value_counts

out = (df.assign(answer=df['answer'].str.split('; '))
       .explode('answer')
       .value_counts('answer')
       .to_frame('Count')
       .reset_index()))
print(out)

  answer  Count
0      A      3
1      C      3
2      B      2

In case, explode is too new for your pandas, you can try .apply(pd.Series).stack() instead.

out = (df['answer'].str.split('; ')
       .apply(pd.Series).stack()
       .value_counts()
       .to_frame('Count')
       .reset_index()
       .rename(columns={'index': 'Answer'}))

CodePudding user response:

You can split, explode and count the values:

print(df['answer'].str.split('; ').explode().value_counts().to_frame('Count').rename_axis('Answer'))
        Count
Answer       
A           3
C           3
B           2

Alternative using collections.Counter (as explode is not available in your version of Pandas):

from collections import Counter

res = pd.DataFrame.from_dict(Counter(df['answer'].str.split('; ').agg(sum)), orient='index', columns=['Count']).rename_axis('Answer')
  • Related