I want a new column that contains the number of times each (user_id, artist_id) pair occurs. For example, if user_id = 0 and artist_id = 10 appear together 5 times, I want to store the number 5 in that column for each of the 5 rows where this occurs. This code gives me the value, but I can't store it.
treino.groupby(['user_id', 'artist_id']).count()
CodePudding user response:
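IIUC, you need a column that holds, in each row, the size of the group that row belongs to. For that, use groupby.transform, which returns a result aligned with the original rows, so it can be assigned directly as a new column: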
df["group_size"] = (
df.assign(group_size=1)
.groupby(["user_id", "artist_id"])["group_size"]
.transform("count")
)
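For reference, here is a minimal sketch of the same idea on a small made-up frame. The sample values are invented for illustration, and the frame is named treino only to match the question. It uses transform("size") instead of the helper column, which should give the same result whenever the counted column has no missing values.

```python
import pandas as pd

# Hypothetical sample data; the real frame will have more columns and rows.
treino = pd.DataFrame({
    "user_id":   [0, 0, 0, 1, 1, 2],
    "artist_id": [10, 10, 20, 10, 10, 30],
})

# transform("size") broadcasts each group's row count back onto its rows,
# so the result has the same index as treino and can be assigned directly.
treino["group_size"] = (
    treino.groupby(["user_id", "artist_id"])["user_id"].transform("size")
)

group_sizes = treino["group_size"].tolist()
print(treino)
```

Note that "count" skips NaN values in the selected column, while "size" counts every row in the group, so the two can differ if the column being counted contains missing values.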