I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({"year": [2000,2000,2000,2001,2001,2001], "A": [1,1,0,0,1,0], "B": [4,4,6,10,10,10]})
df
   year  A   B
0  2000  1   4
1  2000  1   4
2  2000  0   6
3  2001  0  10
4  2001  1  10
5  2001  0  10
I would like to create a unique id number for each combination of values of A and B. So the output would look something like the following (starting with an id of 1 and going up):
   year  A   B  ID_AB
0  2000  1   4      1
1  2000  1   4      1
2  2000  0   6      2
3  2001  0  10      3
4  2001  1  10      4
5  2001  0  10      3
Presumably the first step is
g = df.groupby(["A", "B"])
but what is next? Thanks!
CodePudding user response:
Try .groupby followed by .ngroup():
# sort=False numbers groups in order of first appearance; ngroup() is 0-based, so add 1
df["ID_AB"] = df.groupby(["A", "B"], sort=False).ngroup() + 1
print(df)
Prints:
   year  A   B  ID_AB
0  2000  1   4      1
1  2000  1   4      1
2  2000  0   6      2
3  2001  0  10      3
4  2001  1  10      4
5  2001  0  10      3
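In case it helps, here is a minimal sketch of why sort=False matters for getting the numbering asked for in the question. It rebuilds the same dataframe and computes the ids twice, once with sort=False and once with the default sorting. The names ids_first_seen and ids_sorted are only illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2000, 2000, 2000, 2001, 2001, 2001],
    "A": [1, 1, 0, 0, 1, 0],
    "B": [4, 4, 6, 10, 10, 10],
})

# sort=False: groups are numbered in the order their (A, B) pair first appears.
ids_first_seen = df.groupby(["A", "B"], sort=False).ngroup() + 1

# Default (sort=True): groups are numbered by their sorted (A, B) key instead.
ids_sorted = df.groupby(["A", "B"]).ngroup() + 1

print(ids_first_seen.tolist())
print(ids_sorted.tolist())
```

Both versions give every (A, B) pair one consistent id; they only differ in which pair gets which number. If the ids just need to be unique and the order does not matter, the default sorting should work as well.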