I have a dataset df1 that looks like this:
fake_id  date     type  value
xxx      12.1.22  A
zzz      12.2.22  A
         13.4.22  B     12
Then I have a df2 that looks like this:
name   downloads
Name1  23
I want to count all occurrences of each type (e.g. A and B).
Then I want to add the counts to df2. Something like this:
name   value  count_A  count_B
Name1  23     2        1
I was trying this:
df1 = df1.groupby('type').count()
df1_transposed = df1.T
df1_transposed = df1_transposed[['A', 'B']]
df1_transposed = df1_transposed.reset_index()
df2 = pd.merge(df2, df1_transposed, left_index=True, right_index=True)
df2 = df2.drop(columns='index')
and it gives me an output that looks like this:
   name   value  A  B
0  Name1  12     2  0
Although the value for group A is correct, the value for B is incorrect. This is probably because there are some NULL values in the fake_id column for type B. Hence, after transposing, it takes the value of 0 instead of 1. How can I fix this?
For example, after this part, the table looks like this:
type     A  B
fake_id  2  0
date     2  1
value    0  1
CodePudding user response:
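You could use the value_counts() function to count the occurrences of each value in a column. Unlike groupby('type').count(), which counts the non-null cells in every column, it counts the rows for each type, so NULL values in other columns such as fake_id do not affect the result.
value_counts = df1["type"].value_counts()
# Labels are case-sensitive; .get() returns 0 if a type is absent
a_count = value_counts.get("A", 0)
b_count = value_counts.get("B", 0)
# Assigning a scalar broadcasts it to every row of df2
df2["count_A"] = a_count
df2["count_B"] = b_count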
This should do the trick.
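If the set of types is not known in advance, the same idea can be written as a loop. The following is only a minimal sketch: the sample frames are rebuilt from the tables in the question, the empty cells are assumed to be NaN, and the names per_column and type_counts are made up for illustration. It also shows where the 0 in the original attempt comes from.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; the empty cells are assumed to be NaN
df1 = pd.DataFrame({
    "fake_id": ["xxx", "zzz", np.nan],
    "date": ["12.1.22", "12.2.22", "13.4.22"],
    "type": ["A", "A", "B"],
    "value": [np.nan, np.nan, 12],
})
df2 = pd.DataFrame({"name": ["Name1"], "downloads": [23]})

# groupby(...).count() counts non-null cells per column,
# so type B gets 0 in the fake_id column
per_column = df1.groupby("type").count()

# value_counts() counts rows per type, regardless of NaN in other columns
type_counts = df1["type"].value_counts()

# Add one count_<type> column per type; each scalar is broadcast to all rows of df2
for type_name, n in type_counts.items():
    df2[f"count_{type_name}"] = n

print(df2)
```

Because every count is assigned as a scalar, this also works when df2 has more than one row; each row then receives the same counts.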