Home > database >  Python Pandas Concatenate Combinations All Columns Dynamically
Python Pandas Concatenate Combinations All Columns Dynamically

Time:06-10

COL_A COL_B COL_C
PRODUCT_1 UK 1/1/2021

I want to have a table that contains the result

COL_A COL_B COL_C COL_A_COL_B_COL_C COL_A_COL_B COL_A_COL_C COL_B_COL_C
PRODUCT_1 UK 1/1/2021 PRODUCT_1UK1/1/2021 PRODUCT_1UK PRODUCT_11/1/2021 UK1/1/2021

Basically, combinations concatenated from left to right and place them as new columns in the dataset.

I don't need all of the combinations, so for example, if ABC columns are concatenated, this is enough on that combination (BCA, or ACB are not needed).

How can you do that in Python3 with itertools and pandas? The real table contains more than 10 columns to get their combinations.

CodePudding user response:

You can use pandas.concat:

from itertools import combinations

df.join(pd.concat({a '_' b: df[a] df[b]  for a,b in combinations(df, 2)}, axis=1))

output:

       COL_A COL_B     COL_C  COL_A_COL_B        COL_A_COL_C COL_B_COL_C
0  PRODUCT_1    UK  1/1/2021  PRODUCT_1UK  PRODUCT_11/1/2021  UK1/1/2021

Many combinations:

from itertools import combinations

df.join(pd.concat({'_'.join(x): df[x[0]].str.cat(df[list(x[1:])].astype(str),
                                                 sep='')
                   for i in (2, 3)
                   for x in combinations(df, i)}, axis=1))

output:

       COL_A COL_B     COL_C  COL_A_COL_B        COL_A_COL_C COL_B_COL_C  \
0  PRODUCT_1    UK  1/1/2021  PRODUCT_1UK  PRODUCT_11/1/2021  UK1/1/2021   

     COL_A_COL_B_COL_C  
0  PRODUCT_1UK1/1/2021  
  • Related