Pandas. merge/join/concat. Rows into columns-CodePudding

Given data frames similar to the following:

df1 = pd.DataFrame({'Customer':['Customer1','Customer2','Customer3'],
             'Status':[0,1,1]}

Customer        Status
0   Customer1     0
1   Customer2     1
2   Customer3     1

df2 = pd.DataFrame({'Customer':['Customer1','Customer1','Customer1','Customer2','Customer2','Customer3'],
             'Call': ['01-01','01-02','01-03','02-01','03-02','06-01']})

    Customer    Call
0   Customer1   01-01
1   Customer1   01-02
2   Customer1   01-03
3   Customer2   02-01
4   Customer2   03-02
5   Customer3   06-01

What is the most efficient method for me to merge the two into a third data frame in which the rows from df2 become columns added to df1. In the new df each row should be a unique customer and 'Call' from df2 is added as incrementing columns populated by NaN values as required?

I'd like to end up with something like:

    Customer    Status  Call_1  Call_2  Call_3
0   Customer1   0       01-01   01-02   01-03
1   Customer2   1       02-01   03-02   NaN
2   Customer3   1       06-01   NaN     NaN

I assume some combination of stack() and merge() is required but can't seem to figure it out.

Help appreciated

CodePudding user response：

Use DataFrame.join with new DataFrame reshaped by GroupBy.cumcount and Series.unstack:

df = df1.join(df2.set_index(['Customer', df2.groupby('Customer').cumcount().add(1)])['Call']
                 .unstack().add_prefix('Call_'), 'Customer')
print (df)
    Customer  Status Call_1 Call_2 Call_3
0  Customer1       0  01-01  01-02  01-03
1  Customer2       1  02-01  03-02    NaN
2  Customer3       1  06-01    NaN    NaN

CodePudding user response：

First pivot df2 with a cumcount de-duplication, then merge:

out = df1.merge(df2.assign(n=df2.groupby('Customer').cumcount().add(1))
                   .pivot(index='Customer', columns='n', values='Call')
                   .add_prefix('Call_'),
                left_on='Customer', right_index=True)

Output:

    Customer  Status Call_1 Call_2 Call_3
0  Customer1       0  01-01  01-02  01-03
1  Customer2       1  02-01  03-02    NaN
2  Customer3       1  06-01    NaN    NaN