Getting different Values when using groupby(column)["id"].nunique and trying to add a colu

Time:11-17

I'm trying to count the unique values per group in a dataset and add them as a new column to a table. The first approach works; the second produces wrong values. When I run unique_id_per_column = source_table.groupby("disease").some_id.nunique(), I get

|    | disease                 | some_id |
|---:|:------------------------|--------:|
|  0 | disease1                |   121   |
|  1 | disease2                |     1   |
|  2 | disease3                |     5   |
|  3 | disease4                |     9   |
|  4 | disease5                |    77   |

These numbers seem to check out, but I want to add them to another table that already has a column with the count of all values per group. So I used table["unique_ids"] = source_table.groupby("disease").some_id.transform("nunique") and got the following table, where every row repeats the value that is correct only for the first row.

|    | disease                 |some_id |   unique_ids      |
|---:|:------------------------|-------:|------------------:|
|  0 | disease1                |   151  |               121 |
|  1 | disease2                |     1  |               121 |
|  2 | disease3                |     5  |               121 |
|  3 | disease4                |     9  |               121 |
|  4 | disease5                |    91  |               121 |

I expected to get the same results as in the first table. Does anyone know why the number for the first row is repeated instead of the correct number for each group?

CodePudding user response:

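Use Series.map if you need to create the column in another DataFrame. transform returns one value per row of source_table, indexed like source_table, so assigning it to table aligns on the row index rather than on the disease name. map instead looks up each disease in the per-group result: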

s = source_table.groupby("disease").some_id.nunique()

table["unique_ids"] = table["disease"].map(s) 
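
To see the difference on a small scale, here is a minimal sketch with made-up data. The frames and values below are hypothetical, not the asker's real tables; they only assume that source_table has one row per record and table has one row per disease.

```python
import pandas as pd

# Hypothetical toy data: one row per record, sorted by disease.
source_table = pd.DataFrame({
    "disease": ["disease1", "disease1", "disease1", "disease2", "disease2"],
    "some_id": [10, 10, 11, 20, 20],
})

# Hypothetical summary table: one row per disease.
table = pd.DataFrame({"disease": ["disease1", "disease2"]})

# transform returns one value per row of source_table (index 0..4).
per_row = source_table.groupby("disease")["some_id"].transform("nunique")

# Column assignment aligns on index labels, so table rows 0 and 1 receive
# the values from source_table rows 0 and 1, which both belong to disease1.
table["via_transform"] = per_row

# nunique returns one value per group, indexed by the disease name.
per_group = source_table.groupby("disease")["some_id"].nunique()

# map looks up each disease name in that per-group Series.
table["via_map"] = table["disease"].map(per_group)

print(table)
```

In this sketch the via_transform column repeats the disease1 count in both rows, while via_map carries the correct count for each disease.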