I have the following data frame.
df
Current_date ID Current_date 1 Current_Date 2
19/10/2021 11 0.9 0.8
19/10/2021 11 0.7 0.6
19/10/2021 12 1.0 1.5
19/10/2021 13 0.8 0.5
I would like to rearrange this and get the following table. To make it clear, to get the Date
column values if it is Current_date 1
just and 1 day on the first column (``Current_date) and if it is
Current_date 2``` add two days on the current date. Then create a new column to assign the values.
Current_date ID Date Value
19/10/2021 11 20/10/2021 0.9
19/10/2021 11 20/10/2021 0.7
19/10/2021 12 20/10/2021 1.0
19/10/2021 13 20/10/2021 0.8
19/10/2021 11 21/10/2021 0.8
19/10/2021 11 21/10/2021 0.6
19/10/2021 12 21/10/2021 1.5
19/10/2021 13 21/10/2021 0.5
Can anyone help with this in Pandas Python?
CodePudding user response:
Use DataFrame.melt
with add timedeltas by extracted by last intgers in columns by Series.str.extract
and to_timedelta
:
df = df.melt(['Current_date','ID'], var_name='Date', value_name='Value')
td = pd.to_timedelta(df['Date'].str.extract('(\d )$', expand=False).astype(int), unit='D')
df['Date']=pd.to_datetime(df['Current_date'],dayfirst=True).add(td).dt.strftime('%d/%m/%Y')
print (df)
Current_date ID Date Value
0 19/10/2021 11 20/10/2021 0.9
1 19/10/2021 11 20/10/2021 0.7
2 19/10/2021 12 20/10/2021 1.0
3 19/10/2021 13 20/10/2021 0.8
4 19/10/2021 11 21/10/2021 0.8
5 19/10/2021 11 21/10/2021 0.6
6 19/10/2021 12 21/10/2021 1.5
7 19/10/2021 13 21/10/2021 0.5
CodePudding user response:
One option is with pivot_longer from pyjanitor:
# pip install pyjanitor
import pandas as pd
import janitor
df['Current_date'] = pd.to_datetime(df.Current_date)
(df.rename(columns = {"Current_date 1":"Value 1",
"Current_Date 2":"Value 2"})
.pivot_longer(index=['Current_date', 'ID'],
names_to=(".value", "Date"),
names_sep=" ")
.assign(Date = lambda df: df.Current_date
pd.to_timedelta(df.Date.astype(int), unit='D'))
)
Current_date ID Date Value
0 2021-10-19 11 2021-10-20 0.9
1 2021-10-19 11 2021-10-20 0.7
2 2021-10-19 12 2021-10-20 1.0
3 2021-10-19 13 2021-10-20 0.8
4 2021-10-19 11 2021-10-21 0.8
5 2021-10-19 11 2021-10-21 0.6
6 2021-10-19 12 2021-10-21 1.5
7 2021-10-19 13 2021-10-21 0.5
You could also achieve this with pd.wide_to_long:
df = df.rename(columns = {"Current_date 1":"Value 1",
"Current_Date 2":"Value 2"})
(pd.wide_to_long(df.reset_index(),
stubnames = 'Value',
i = ['Current_date', 'ID', 'index'],
j= 'Date',
sep=' ')
.droplevel('index')
.reset_index()
.assign(Date = lambda df: df.Current_date
pd.to_timedelta(df.Date.astype(int), unit='D'))
)
Current_date ID Date Value
0 2021-10-19 11 2021-10-20 0.9
1 2021-10-19 11 2021-10-21 0.8
2 2021-10-19 11 2021-10-20 0.7
3 2021-10-19 11 2021-10-21 0.6
4 2021-10-19 12 2021-10-20 1.0
5 2021-10-19 12 2021-10-21 1.5
6 2021-10-19 13 2021-10-20 0.8
7 2021-10-19 13 2021-10-21 0.5