Home > Enterprise >  How to use pandas to count the value from multiple column at once
How to use pandas to count the value from multiple column at once

Time:10-12

I learned that when I type df = pd.read_csv("file.csv"), I can bring this data from csv file like below :

    round        date  first  second  third  fourth  fifth  sixth  bonus
0      959  2021.04.17      1      14     15      24     40     41     35
1      958  2021.04.10      2       9     10      16     35     37      1
2      957  2021.04.03      4      15     24      35     36     40      1
3      956  2021.03.27     10      11     20      21     25     41     40
4      955  2021.03.20      4       9     23      26     29     33      8

Also I learned that I can use df["column_name"].value_counts() to count how frequently the value wrote.

For example, df["first"].value_count(), then it will show the 'first' column's count value as a result. like

1     134
2     107
3      94 ...

But I want to count the all the value in multiple column at once.

How can I count the value from multiple column at once? Can some tell me which method to use?

CodePudding user response:

If you want a matrix of all present values in all columns from start to end, use value_counts as suggested by @jezrael:

>>> df.iloc[:, 2:].apply(pd.value_counts).fillna(0).astype(int)

    first  second  third  fourth  fifth  sixth  bonus
1       1       0      0       0      0      0      2
2       1       0      0       0      0      0      0
4       2       0      0       0      0      0      0
8       0       0      0       0      0      0      1
9       0       2      0       0      0      0      0
10      1       0      1       0      0      0      0
11      0       1      0       0      0      0      0
14      0       1      0       0      0      0      0
15      0       1      1       0      0      0      0
16      0       0      0       1      0      0      0
20      0       0      1       0      0      0      0
21      0       0      0       1      0      0      0
23      0       0      1       0      0      0      0
24      0       0      1       1      0      0      0
25      0       0      0       0      1      0      0
26      0       0      0       1      0      0      0
29      0       0      0       0      1      0      0
33      0       0      0       0      0      1      0
35      0       0      0       1      1      0      1
36      0       0      0       0      1      0      0
37      0       0      0       0      0      1      0
40      0       0      0       0      1      1      1
41      0       0      0       0      0      2      0

CodePudding user response:

You can select all columns without first 2 and count values, last replace NaNs and convert to integers:

df.iloc[:, 2:].apply(pd.value_counts).fillna(0).astype(int)

Alternative with different format (transpose first solution):

df.iloc[:, 2:].apply(pd.value_counts, axis=1).fillna(0).astype(int)
  • Related