Count word frequency with groupby

I have a CSV file with a single tag column:

tag
A
B
B
C
C
C
C

When I run groupby to count the word frequency, the output does not contain the frequency numbers:

#!/usr/bin/env python3
import pandas as pd

def count(fname):
    df = pd.read_csv(fname)
    print(df)
    dfg = df.groupby('tag').count().reset_index()
    print(dfg)
    return
count("save.txt")

The output has no frequency column:

  tag
0   A
1   B
2   B
3   C
4   C
5   C
6   C
  tag
0   A
1   B
2   C

Expected output:

  tag  freq
0   A     1
1   B     2
2   C     4

CodePudding user response：

Looks close to me, per my comment:

df = pd.DataFrame({'tag': ['A', 'B', 'B', 'C', 'C', 'C', 'C']})

df.groupby(['tag'], as_index=False).agg(freq=('tag', 'count'))
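
As a hedged aside on why the original attempt printed no numbers: count() reports the non-null count of every column other than the grouping key, and this frame has no other column, so only the key is left. The sketch below reuses the sample data from the question and swaps in size(), which counts rows per group and therefore needs no second column. The variable names counts_empty and counts are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'tag': ['A', 'B', 'B', 'C', 'C', 'C', 'C']})

# count() aggregates the non-key columns; there are none here,
# so only the grouping key survives after reset_index().
counts_empty = df.groupby('tag').count().reset_index()
print(counts_empty.columns.tolist())

# size() counts rows per group; reset_index(name=...) turns the
# resulting Series into a two-column frame: tag, freq.
counts = df.groupby('tag').size().reset_index(name='freq')
print(counts)
```

Either this or the named aggregation above should give one row per tag with its frequency, ordered by tag.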

CodePudding user response：

You could count the values directly and name the resulting column:

Input:

df = df['tag'].value_counts().rename_axis('tag').reset_index(name='freq')

Output:

    tag freq
0     C    4
1     B    2
2     A    1

CodePudding user response：

You should use value_counts() rather than count():

df.groupby("tag").value_counts().reset_index(name="freq")

outputs:

  tag  freq
0   A     1
1   B     2
2   C     4

To sort in descending order:

df.groupby("tag").value_counts().reset_index(name="freq").sort_values(
    by="freq", ascending=False
)
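
To tie this back to the script in the question, here is a minimal end-to-end sketch. It assumes the file has a single header line, tag, followed by one value per line; io.StringIO stands in for save.txt so the snippet runs on its own, and csv_text and result are made-up names for illustration.

```python
import io

import pandas as pd

# Stand-in for the one-column file from the question (save.txt).
csv_text = "tag\nA\nB\nB\nC\nC\nC\nC\n"


def count(source):
    """Return one row per tag with its frequency, most frequent first."""
    df = pd.read_csv(source)
    return (
        df["tag"]
        .value_counts()            # Series indexed by tag, sorted descending
        .rename_axis("tag")        # make sure the index is named 'tag'
        .reset_index(name="freq")  # columns: tag, freq
    )


result = count(io.StringIO(csv_text))
print(result)
```

Passing a file path such as "save.txt" instead of the StringIO object should work the same way, since read_csv accepts both.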