What is a pythonic way of counting level 1 size per level 0 in multi index and creating a new column (named counts
). I can achieve this in the following way but would like to gain an understanding of any simpler approaches:
Code
df = pd.DataFrame({'STNAME':['AL'] * 3 ['MI'] * 4,
'CTYNAME':list('abcdefg'),
'COL': range(7) }).set_index(['STNAME','CTYNAME'])
print(df)
COL
STNAME CTYNAME
AL a 0
b 1
c 2
MI d 3
e 4
f 5
g 6
df1 = df.groupby(level=0).size().reset_index(name='count')
counts = df.merge(df1,left_on="STNAME",right_on="STNAME")["count"].values
df["counts"] = counts
This is the desired output:
COL counts
STNAME CTYNAME
AL a 0 3
b 1 3
c 2 3
MI d 3 4
e 4 4
f 5 4
g 6 4
CodePudding user response:
You can use groupby.transform
with size here instead of merging:
output = df.assign(Counts=df.groupby(level=0)['COL'].transform('size'))
print(output)
COL Counts
STNAME CTYNAME
AL a 0 3
b 1 3
c 2 3
MI d 3 4
e 4 4
f 5 4
g 6 4