I have a sqlite table called "word_table" shown below
id | word |
---|---|
1 | hi |
1 | bye |
1 | hello |
2 | this |
2 | hello |
2 | this |
3 | that |
3 | how |
How do i use pandas to create a dataframe with the output below
id | word | count |
---|---|---|
1 | hi | 1 |
1 | bye | 1 |
1 | hello | 1 |
2 | this | 2 |
2 | hello | 1 |
3 | that | 1 |
3 | how | 1 |
It computes the number of times each word appears in each id and store it in a new column "count".
CodePudding user response:
You can combine .groupby()
.value_counts()
:
df["count"] = df.groupby("id")["word"].transform(
lambda x: x.map(x.value_counts())
)
print(df.drop_duplicates())
Prints:
id word count
0 1 hi 1
1 1 bye 1
2 1 hello 1
3 2 this 2
4 2 hello 1
6 3 that 1
7 3 how 1