For example, I have several columns of dates and I want to get the month from them. Is there a way to loop through columns instead of running pd.DatetimeIndex(df['date']).month multiple times? The example below is simplified. The real dataset has many more columns.
import pandas as pd
import numpy as np
np.random.seed(0)
rng_start = pd.date_range('2015-07-24', periods=5, freq='M')
rng_mid = pd.date_range('2019-06-24', periods=5, freq='M')
rng_end = pd.date_range('2022-03-24', periods=5, freq='M')
df = pd.DataFrame({ 'start_date': rng_start, 'mid_date': rng_mid, 'end_date': rng_end })
df
start_date mid_date end_date
0 2015-07-31 2019-06-30 2022-03-31
1 2015-08-31 2019-07-31 2022-04-30
2 2015-09-30 2019-08-31 2022-05-31
3 2015-10-31 2019-09-30 2022-06-30
4 2015-11-30 2019-10-31 2022-07-31
The intended output would be
start_date mid_date end_date start_month mid_month end_month
0 2015-07-31 2019-06-30 2022-03-31 7 6 3
1 2015-08-31 2019-07-31 2022-04-30 8 7 4
2 2015-09-30 2019-08-31 2022-05-31 9 8 5
3 2015-10-31 2019-09-30 2022-06-30 10 9 6
4 2015-11-30 2019-10-31 2022-07-31 11 10 7
CodePudding user response:
Try apply
df[['start_month', 'mid_month', 'end_month']] = df.apply(lambda x : x.dt.month,axis=1)
df
Out[244]:
start_date mid_date end_date start_month mid_month end_month
0 2015-07-31 2019-06-30 2022-03-31 7 6 3
1 2015-08-31 2019-07-31 2022-04-30 8 7 4
2 2015-09-30 2019-08-31 2022-05-31 9 8 5
3 2015-10-31 2019-09-30 2022-06-30 10 9 6
4 2015-11-30 2019-10-31 2022-07-31 11 10 7
CodePudding user response:
You answered your question by saying "loop through columns":
for column in df:
df[column.replace("_date", "_month")] = df[column].dt.month
An alternative solution (a variation of @BENY's):
df[df.columns.str.replace("_date", "_month")] = df.apply(lambda x: x.dt.month, axis=1)