Home > OS >  How to concatenate columns in a Dataframe based on the date?
How to concatenate columns in a Dataframe based on the date?

Time:11-12

In a data frame similar to the one below how can I create the Concatenation column based on the date of each activity?

Activity A Activity B Activity C Concatenation
1/1/2022 1/15/2022 2/3/2022 Activity A --> Activity B --> Activity C
1/15/2022 2/3/2022 1/1/2022 Activity C --> Activity A --> Activity B

CodePudding user response:

You can use numpy's argsort:

df2 = df.filter(like='Activity').apply(pd.to_datetime, dayfirst=False)

df['Concatenation'] = list(map(' -> '.join, df2.columns.to_numpy()[np.argsort(df2.to_numpy())]))

Or with pandas only (less efficient):

df['Concatenation'] = (df
 .filter(like='Activity')
 .apply(pd.to_datetime, dayfirst=False)
 .apply(lambda s: ' -> '.join(s.sort_values().index), axis=1)
)

output:

  Activity A Activity B Activity C                           Concatenation
0   1/1/2022  1/15/2022   2/3/2022  Activity A -> Activity B -> Activity C
1  1/15/2022   2/3/2022   1/1/2022  Activity C -> Activity A -> Activity B

CodePudding user response:

Suppose you have the data in a file called sample.csv, I would do lke this

  df = pd.read_csv('sample.csv')
  df
    Activity A  Activity B  Activity C
  0 1/1/2022    1/15/2022   2/3/2022
  1 1/15/2022   2/3/2022    1/1/2022

  df['Concatenation'] = df[['Activity A', 'Activity B', 'Activity C']].apply(lambda x: ' -> '.join(list(x)), axis=1)
  df

    Activity A  Activity B  Activity C  Concatenation
  0 1/1/2022    1/15/2022   2/3/2022    1/1/2022 -> 1/15/2022 -> 2/3/2022
  1 1/15/2022   2/3/2022    1/1/2022    1/15/2022 -> 2/3/2022 -> 1/1/2022
  • Related