create id number for unique combination of column values

Time:07-13

I have a dataframe like this:

import pandas as pd

df = pd.DataFrame({"year": [2000,2000,2000,2001,2001,2001], "A": [1,1,0,0,1,0], "B": [4,4,6,10,10,10]})
df
    year    A   B
0   2000    1   4
1   2000    1   4
2   2000    0   6
3   2001    0   10
4   2001    1   10
5   2001    0   10

I would like to create a unique id number for each combination of values of A and B. The output would look something like the following (ids start at 1 and go up):

    year    A   B  ID_AB
0   2000    1   4  1
1   2000    1   4  1
2   2000    0   6  2
3   2001    0   10 3
4   2001    1   10 4
5   2001    0   10 3

Presumably the first step is

g = df.groupby(["A", "B"])

but what is next? Thanks!

CodePudding user response:

Try .groupby followed by .ngroup():

df["ID_AB"] = df.groupby(["A", "B"], sort=False).ngroup() + 1
print(df)

Prints:

   year  A   B  ID_AB
0  2000  1   4      1
1  2000  1   4      1
2  2000  0   6      2
3  2001  0  10      3
4  2001  1  10      4
5  2001  0  10      3
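
Note that sort=False matters here: it numbers the groups in order of first appearance, which is what the desired output asks for. A minimal sketch of the difference, using the same data as the question (the variable names first_seen and sorted_ids are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2000, 2000, 2000, 2001, 2001, 2001],
    "A": [1, 1, 0, 0, 1, 0],
    "B": [4, 4, 6, 10, 10, 10],
})

# sort=False: groups are numbered in the order they first appear in the frame.
# ngroup() is 0-based, so add 1 to start the ids at 1.
first_seen = df.groupby(["A", "B"], sort=False).ngroup() + 1
print(first_seen.tolist())  # [1, 1, 2, 3, 4, 3]

# Default sort=True: groups are numbered by sorted (A, B) key order instead,
# i.e. (0, 6) -> 1, (0, 10) -> 2, (1, 4) -> 3, (1, 10) -> 4.
sorted_ids = df.groupby(["A", "B"]).ngroup() + 1
print(sorted_ids.tolist())  # [3, 3, 1, 2, 4, 2]
```

Either way each (A, B) combination gets one consistent id; only the numbering order changes.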