How to concatenate groups into a new string column in pandas?-CodePudding

I'm still pretty new to pandas, and can't quite wrap my head around this one. I have a table that looks like this:

Timestamp	Sport	Place
10/1/2022 12:05:00	basketball	place1
10/1/2022 4:05:00	basketball	place1
10/2/2022 16:05	basketball	place1
10/2/2022 9:00	basketball	place2
10/2/2022 11:00	football	place3
10/2/2022 11:15	soccer	place1
10/3/2022 11:15	baseball	place4
10/3/2022 11:20	tennis	place5
10/3/2022 11:30	tennis	place5

And I'd like to read have a python script that uses pandas to produce this:

Date	Sport	Place
1-Oct	2x basketball	place1
2-Oct	2x basketball, 1x soccer	place1
	1x basketball	place2
	1x football	place3
3-Oct	1x baseball	place4
	2x tennis	place5

I can't quite figure out how to do the groupby correctly and then to concatenate the columns.

(And i'm going to want to do this for many similar tables, so i want to use a script.)

This gets me very close:

val_cnts = df.groupby(['date', 'Place'])['Sport'].value_counts()

But this returns a multi-indexed series of just the sport counts, so it looks like I need to turn some of the indices into columns, and then do some concatenations. When I do a .reset_index() on the Place and Sport indices though, that still doesn't work because the groupings don't seem right anymore.

CodePudding user response：

Try:

df["Timestamp"] = pd.to_datetime(df["Timestamp"])
df["Date"] = df["Timestamp"].dt.strftime("%-d-%b")

df = (
    df.groupby(["Date", "Place"])["Sport"]
    .value_counts()
    .to_frame(name="x")
    .reset_index(level=2)
)
df["Sport"] = df["x"].astype(str)   "x "   df["Sport"]
df = df.groupby(level=[0, 1])["Sport"].agg(", ".join).reset_index()

print(df)

Prints:

    Date   Place                     Sport
0  1-Oct  place1             2x basketball
1  2-Oct  place1  1x basketball, 1x soccer
2  2-Oct  place2             1x basketball
3  2-Oct  place3               1x football
4  3-Oct  place4               1x baseball
5  3-Oct  place5                 2x tennis