Home > database >  Trying to group by using multiple columns
Trying to group by using multiple columns

Time:11-23

Using Pandas I am trying to do group by for multiple columns and then fill the pandas dataframe where a person name is not present

For Example this is my Dataframe enter image description here

V1  V2  V3  PN

1   10  20  A 

2   10  21  A

3   10  20  C

I have a unique person name list = ['A','B','C','D','E']

Expected Outcome:- enter image description here

V1  V2  V3  PN

1   10  20  A

1   10  20  B

1   10  20  C

1   10  20  D

1   10  20  E

2   10  21  A

2   10  21  B

2   10  21  C

2   10  21  D

2   10  21  E

3   10  20  A

3   10  20  B

3   10  20  C

3   10  20  D

3   10  20  E

I was thinking about trying group by pandas statement but it didnt work out

CodePudding user response:

Try this, using pd.MultiIndex with reindex to create additional rows:

import pandas as pd

df = pd.DataFrame({'Version 1':[1,2,3],
                   'Version 2':[10,10,10],
                   'Version 3':[20,21,20],
                   'Person Name':'A A C'.split(' ')})

p_list = [*'ABCDE']

df.set_index(['Version 1', 'Person Name'])\
  .reindex(pd.MultiIndex.from_product([df['Version 1'].unique(), p_list],
           names=['Version 1', 'Person Name']))\
  .groupby(level=0, group_keys=False).apply(lambda x: x.ffill().bfill())\
  .reset_index()

Output:

    Version 1 Person Name  Version 2  Version 3
0           1           A       10.0       20.0
1           1           B       10.0       20.0
2           1           C       10.0       20.0
3           1           D       10.0       20.0
4           1           E       10.0       20.0
5           2           A       10.0       21.0
6           2           B       10.0       21.0
7           2           C       10.0       21.0
8           2           D       10.0       21.0
9           2           E       10.0       21.0
10          3           A       10.0       20.0
11          3           B       10.0       20.0
12          3           C       10.0       20.0
13          3           D       10.0       20.0
14          3           E       10.0       20.0
  • Related