Home > OS >  Reshaping pandas data frame based on date information
Reshaping pandas data frame based on date information

Time:10-19

I have the following data frame.

df

Current_date     ID     Current_date 1  Current_Date 2  
19/10/2021       11     0.9                  0.8
19/10/2021       11     0.7                  0.6
19/10/2021       12     1.0                  1.5 
19/10/2021       13     0.8                  0.5

I would like to rearrange this and get the following table. To make it clear, to get the Date column values if it is Current_date 1 just and 1 day on the first column (``Current_date) and if it is Current_date 2``` add two days on the current date. Then create a new column to assign the values.

Current_date     ID     Date                Value  
19/10/2021       11     20/10/2021          0.9               
19/10/2021       11     20/10/2021          0.7
19/10/2021       12     20/10/2021          1.0        
19/10/2021       13     20/10/2021          0.8                                
19/10/2021       11     21/10/2021          0.8           
19/10/2021       11     21/10/2021          0.6
19/10/2021       12     21/10/2021          1.5        
19/10/2021       13     21/10/2021          0.5                

Can anyone help with this in Pandas Python?

CodePudding user response:

Use DataFrame.melt with add timedeltas by extracted by last intgers in columns by Series.str.extract and to_timedelta:

df = df.melt(['Current_date','ID'], var_name='Date', value_name='Value')

td = pd.to_timedelta(df['Date'].str.extract('(\d )$', expand=False).astype(int), unit='D')

df['Date']=pd.to_datetime(df['Current_date'],dayfirst=True).add(td).dt.strftime('%d/%m/%Y')
print (df)
  Current_date  ID        Date  Value
0   19/10/2021  11  20/10/2021    0.9
1   19/10/2021  11  20/10/2021    0.7
2   19/10/2021  12  20/10/2021    1.0
3   19/10/2021  13  20/10/2021    0.8
4   19/10/2021  11  21/10/2021    0.8
5   19/10/2021  11  21/10/2021    0.6
6   19/10/2021  12  21/10/2021    1.5
7   19/10/2021  13  21/10/2021    0.5

CodePudding user response:

One option is with pivot_longer from pyjanitor:

# pip install pyjanitor
import pandas as pd
import janitor

df['Current_date'] = pd.to_datetime(df.Current_date)

(df.rename(columns = {"Current_date 1":"Value 1", 
                      "Current_Date 2":"Value 2"})
   .pivot_longer(index=['Current_date', 'ID'], 
                 names_to=(".value", "Date"), 
                 names_sep=" ")
   .assign(Date = lambda df: df.Current_date   
                             pd.to_timedelta(df.Date.astype(int), unit='D'))
) 
  Current_date  ID       Date  Value
0   2021-10-19  11 2021-10-20    0.9
1   2021-10-19  11 2021-10-20    0.7
2   2021-10-19  12 2021-10-20    1.0
3   2021-10-19  13 2021-10-20    0.8
4   2021-10-19  11 2021-10-21    0.8
5   2021-10-19  11 2021-10-21    0.6
6   2021-10-19  12 2021-10-21    1.5
7   2021-10-19  13 2021-10-21    0.5

You could also achieve this with pd.wide_to_long:

df = df.rename(columns = {"Current_date 1":"Value 1", 
                          "Current_Date 2":"Value 2"})


(pd.wide_to_long(df.reset_index(), 
                 stubnames = 'Value', 
                 i = ['Current_date', 'ID', 'index'], 
                 j= 'Date', 
                 sep=' ')
  .droplevel('index')
  .reset_index()
  .assign(Date = lambda df: df.Current_date   
                            pd.to_timedelta(df.Date.astype(int), unit='D'))
)
  Current_date  ID       Date  Value
0   2021-10-19  11 2021-10-20    0.9
1   2021-10-19  11 2021-10-21    0.8
2   2021-10-19  11 2021-10-20    0.7
3   2021-10-19  11 2021-10-21    0.6
4   2021-10-19  12 2021-10-20    1.0
5   2021-10-19  12 2021-10-21    1.5
6   2021-10-19  13 2021-10-20    0.8
7   2021-10-19  13 2021-10-21    0.5
  • Related