Home > other >  Pandas - Make some headers row values
Pandas - Make some headers row values

Time:09-29

I have what I assume might be a complexed ask.

I have a few columns in my dataframe, for each date column I want to store the date headers in the column called "Dates" and then I want to create 2 new columns to store the max and min values

DataFrame

DataFrame

ID Item DateMade_Min DateMade_Max DelDate_Min DelDate_Max ExpDate_Min ExpDate_Max
1 2322 01/01/2020 01/01/2020 05/06/2020 07/06/2020 06/05/2022 09/09/2022
2 4454 03/04/2020 01/01/2021 07/08/2020 31/08/2020 15/12/2022 09/01/2023

Desired Output

ID Item Dates Min Max
1 2322 DateMade 01/01/2020 01/01/2020
1 2322 DelDate 05/06/2020 07/06/2020
1 2322 ExpDate 06/05/2022 09/09/2022
2 4454 DateMade 03/04/2020 01/01/2021
2 4454 DelDate 07/08/2020 31/08/2020
2 4454 ExpDate 15/12/2022 09/01/2023

CodePudding user response:

You can reshape with an intermediate stacking and a MultiIndex:

out = (df
   .set_index(['ID', 'Item'])
   .pipe(lambda d: d.set_axis(d.columns.str.split('_', expand=True), axis=1))
   .stack(0)
   .reset_index().rename(columns={'level_2': 'Dates'})
)

output:

   ID  Item     Dates         Max         Min
0   1  2322  DateMade  01/01/2020  01/01/2020
1   1  2322   DelDate  07/06/2020  05/06/2020
2   1  2322   ExpDate  09/09/2022  06/05/2022
3   2  4454  DateMade  01/01/2021  03/04/2020
4   2  4454   DelDate  31/08/2020  07/08/2020
5   2  4454   ExpDate  09/01/2023  15/12/2022
alternative

Alternatively, you can use the janitor helper module and its pivot_longer function:

# pip install janitor
import janitor

out = df.pivot_longer(
    index=['ID', 'Item'], 
    names_to=('Dates', '.value'), 
    names_sep = '_', 
    sort_by_appearance=True
)

CodePudding user response:

here is one way to doit

# melt the DF 
df2=df.melt(['ID','Item'])

# split the date field on '_' to form two columns

df2[['date','minmax']]=df2['variable'].str.split('_', expand=True)

# use pivot to reformat the resultset
df2.pivot(index=['ID','Item','date'], columns='minmax', values='value').reset_index()

minmax  ID  Item    date    Max     Min
0   1   2322    DateMade    01/01/2020  01/01/2020
1   1   2322    DelDate     07/06/2020  05/06/2020
2   1   2322    ExpDate     09/09/2022  06/05/2022
3   2   4454    DateMade    01/01/2021  03/04/2020
4   2   4454    DelDate     31/08/2020  07/08/2020
5   2   4454    ExpDate     09/01/2023  15/12/2022
  • Related