So I have data similar to this:
import pandas as pd
df = pd.DataFrame({'Order ID':[555,556,557,558,559,560,561,562,563,564,565,566],
'State':["MA","MA","MA","MA","MA","MA","CT","CT","CT","CT","CT","CT"],
'County':["Essex","Essex","Essex","Worcester","Worcester","Worcester","Bristol","Bristol","Bristol","Hartford","Hartford","Hartford"],
'AP':[50,50,75,100,100,125,150,150,175,200,200,225]})
but I need to add a column that shows the mode of AP grouped by State and County. I can get the mode this way:
(df.groupby(['State', 'County']).AP.agg(Mode = (lambda x: x.value_counts().index[0])).reset_index().round(0))
I'm just not sure how I can get that data added to the original data so that it looks like this:
Order ID | State | County | AP | Mode |
---|---|---|---|---|
555 | MA | Essex | 50 | 50 |
556 | MA | Essex | 50 | 50 |
557 | MA | Essex | 75 | 50 |
558 | MA | Worcester | 100 | 100 |
559 | MA | Worcester | 100 | 100 |
560 | MA | Worcester | 125 | 100 |
561 | CT | Bristol | 150 | 150 |
562 | CT | Bristol | 150 | 150 |
563 | CT | Bristol | 175 | 150 |
564 | CT | Hartford | 200 | 200 |
565 | CT | Hartford | 200 | 200 |
566 | CT | Hartford | 225 | 200 |
CodePudding user response:
Use GroupBy.transform
for new column:
df['Mode'] = (df.groupby(['State', 'County']).AP
.transform(lambda x: x.value_counts().index[0]))
Or Series.mode
:
df['Mode'] = df.groupby(['State', 'County']).AP.transform(lambda x: x.mode().iat[0])