Is there a way to concatenate columns by groups of 2 same values in pandas-CodePudding

Say i have something like this in a pandas dataframe :

Entity	Type	Doc	Proj
Daniel	PER	1	1
Daniel	PER	4	2
Daniel	PER	5	3
Daniel	PER	9	6
Daniel	LOC	7	4
905-888-8988	ID	3	1
905-888-8988	ID	4	2
905-888-8988	ID	14	8

For each combo of entity and type that is recurring, i'd like to add two new columns for the doc and proj corresponding to the match. I'd like to do this to all possible match between combo of entity.

Edit 1 More detailed explanation to get to the expected outcome

Step 1 - Identify if an "Entity" and "Type" combo has more than 1 occurence in the dataframe.

Step 2 - For each combo that have more than 1 occurence, i would need to represent all possible combination of "Doc" and "Proj" of the combos.

Step 3 - All these possible combination should be represented in pairs of "doc" and "proj"

So the result would look like this in the pandas dataframe :

Entity	Type	Doc1	Proj1	Doc2	Proj2
Daniel	PER	1	1	4	2
Daniel	PER	1	1	5	3
Daniel	PER	1	1	9	6
Daniel	PER	4	2	5	3
Daniel	PER	4	2	9	6
Daniel	PER	5	3	9	6
905-888-8988	ID	3	1	4	2
905-888-8988	ID	3	1	14	8
905-888-8988	ID	4	2	14	8

Thanks all for the help

CodePudding user response：

here is one way:

reset_index to copy the index as a column
use merge to join df with itself on entity and type columns
remove the duplicate pairs by only keeping the smaller index of original right side of merge.

df = df.reset_index()
res = pd.merge(df, df, on=['Entity', 'Type'],suffixes=['1', '2'])
res = res.loc[res.index1 < res.index2].drop(columns=['index1', 'index2']).reset_index(drop=True)

output:

>>
         Entity Type  Doc1  Proj1  Doc2  Proj2
0        Daniel  PER     1      1     4      2
1        Daniel  PER     1      1     5      3
2        Daniel  PER     1      1     9      6
3        Daniel  PER     4      2     5      3
4        Daniel  PER     4      2     9      6
5        Daniel  PER     5      3     9      6
6  905-888-8988   ID     3      1     4      2
7  905-888-8988   ID     3      1    14      8
8  905-888-8988   ID     4      2    14      8