Pandas finding matching entries on rows and merging columns in those rows into one

Time:02-24

I have a DataFrame like this.

Index  Product  Purchase_Address        Order_Date
0      A        604 Cherry st, Dallas   2019-10-28
1      B        225 5th st, LA          2019-10-29
2      C        604 Cherry st, Dallas   2019-10-28
3      D        225 5th st, LA          2019-10-29
4      E        967 12th st, NY         2019-10-27
5      F        967 12th st, NY         2019-10-27
6      A        628 Jefferson St, NY    2019-10-20
7      B        628 Jefferson St, NY    2019-10-20
8      A        694 Meadow St, Atlanta  2019-10-25
9      B        694 Meadow St, Atlanta  2019-10-25
10     C        27 Wilson St, Austin    2019-10-26
11     D        27 Wilson St, Austin    2019-10-26

I need to make a new DataFrame where the products are merged into a single column if the address and order date are the same (meaning they were ordered at the same time).

The df should look something like this:

Index  Product  Purchase_Address
0      A, C     604 Cherry st, Dallas
1      B, D     225 5th st, LA
2      E, F     967 12th st, NY
3      A, B     628 Jefferson St, NY
4      A, B     694 Meadow St, Atlanta
5      C, D     27 Wilson St, Austin

And then, from that, a second df that counts how many times each combination occurs:

Index  Product_Combination  Nr_Of_Times
0      A, C                 1
1      B, D                 1
2      E, F                 1
3      A, B                 2
4      C, D                 1

How would I achieve something like this? Thanks!

CodePudding user response:

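Group by address and order date, join the products of each group, then count how often each joined string occurs. Use GroupBy.agg with GroupBy.count and Series.to_frame: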

In [1783]: out = df.groupby(['Purchase_Address', 'Order_Date']).agg({'Product': ','.join}).groupby('Product')['Product'].count().to_frame('Nr_Of_Times').reset_index()

In [1784]: out
Out[1784]: 
  Product  Nr_Of_Times
0     A,B            2
1     A,C            1
2     B,D            1
3     C,D            1
4     E,F            1
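
If you also want the intermediate frame from the question (one row per order), the one-liner above can be split into two steps. This is only a sketch: the sample data is retyped from the question, the names `merged` and `counts` are made up for illustration, and sorting the products before joining is my own assumption so that "A, B" and "B, A" count as the same combination.

```python
import pandas as pd

# Sample data retyped from the question.
df = pd.DataFrame({
    "Product": list("ABCDEFABABCD"),
    "Purchase_Address": [
        "604 Cherry st, Dallas", "225 5th st, LA",
        "604 Cherry st, Dallas", "225 5th st, LA",
        "967 12th st, NY", "967 12th st, NY",
        "628 Jefferson St, NY", "628 Jefferson St, NY",
        "694 Meadow St, Atlanta", "694 Meadow St, Atlanta",
        "27 Wilson St, Austin", "27 Wilson St, Austin",
    ],
    "Order_Date": [
        "2019-10-28", "2019-10-29", "2019-10-28", "2019-10-29",
        "2019-10-27", "2019-10-27", "2019-10-20", "2019-10-20",
        "2019-10-25", "2019-10-25", "2019-10-26", "2019-10-26",
    ],
})

# Step 1: one row per (address, date) pair, products joined into one string.
# sort=False keeps the groups in order of first appearance.
# sorted(...) is an assumption: it makes the joined string order-independent.
merged = (
    df.groupby(["Purchase_Address", "Order_Date"], sort=False)["Product"]
      .agg(lambda s: ", ".join(sorted(s)))
      .reset_index()[["Product", "Purchase_Address"]]
)

# Step 2: count how many orders share the same product combination.
counts = (
    merged.groupby("Product", sort=False)
          .size()
          .reset_index(name="Nr_Of_Times")
          .rename(columns={"Product": "Product_Combination"})
)

print(merged)
print(counts)
```

Note that two orders only land in the same group if the address and date strings match exactly, so differences in spelling or capitalisation would have to be normalised first.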