Home > Net >  How to concatenate groups into a new string column in pandas?
How to concatenate groups into a new string column in pandas?

Time:11-29

I'm still pretty new to pandas, and can't quite wrap my head around this one. I have a table that looks like this:

Timestamp Sport Place
10/1/2022 12:05:00 basketball place1
10/1/2022 4:05:00 basketball place1
10/2/2022 16:05 basketball place1
10/2/2022 9:00 basketball place2
10/2/2022 11:00 football place3
10/2/2022 11:15 soccer place1
10/3/2022 11:15 baseball place4
10/3/2022 11:20 tennis place5
10/3/2022 11:30 tennis place5

And I'd like to read have a python script that uses pandas to produce this:

Date Sport Place
1-Oct 2x basketball place1
2-Oct 2x basketball, 1x soccer place1
1x basketball place2
1x football place3
3-Oct 1x baseball place4
2x tennis place5

I can't quite figure out how to do the groupby correctly and then to concatenate the columns.

(And i'm going to want to do this for many similar tables, so i want to use a script.)

This gets me very close:

val_cnts = df.groupby(['date', 'Place'])['Sport'].value_counts()

But this returns a multi-indexed series of just the sport counts, so it looks like I need to turn some of the indices into columns, and then do some concatenations. When I do a .reset_index() on the Place and Sport indices though, that still doesn't work because the groupings don't seem right anymore.

CodePudding user response:

Try:

df["Timestamp"] = pd.to_datetime(df["Timestamp"])
df["Date"] = df["Timestamp"].dt.strftime("%-d-%b")

df = (
    df.groupby(["Date", "Place"])["Sport"]
    .value_counts()
    .to_frame(name="x")
    .reset_index(level=2)
)
df["Sport"] = df["x"].astype(str)   "x "   df["Sport"]
df = df.groupby(level=[0, 1])["Sport"].agg(", ".join).reset_index()

print(df)

Prints:

    Date   Place                     Sport
0  1-Oct  place1             2x basketball
1  2-Oct  place1  1x basketball, 1x soccer
2  2-Oct  place2             1x basketball
3  2-Oct  place3               1x football
4  3-Oct  place4               1x baseball
5  3-Oct  place5                 2x tennis
  • Related