Home > Mobile >  How to loop through multiple columns to create multiple new columns in
How to loop through multiple columns to create multiple new columns in

Time:05-24

For example, I have several columns of dates and I want to get the month from them. Is there a way to loop through columns instead of running pd.DatetimeIndex(df['date']).month multiple times? The example below is simplified. The real dataset has many more columns.

import pandas as pd
import numpy as np

np.random.seed(0)
rng_start = pd.date_range('2015-07-24', periods=5, freq='M')
rng_mid  = pd.date_range('2019-06-24', periods=5, freq='M')
rng_end  = pd.date_range('2022-03-24', periods=5, freq='M')

df = pd.DataFrame({ 'start_date': rng_start, 'mid_date': rng_mid, 'end_date': rng_end }) 
df

    start_date  mid_date    end_date
0   2015-07-31  2019-06-30  2022-03-31
1   2015-08-31  2019-07-31  2022-04-30
2   2015-09-30  2019-08-31  2022-05-31
3   2015-10-31  2019-09-30  2022-06-30
4   2015-11-30  2019-10-31  2022-07-31

The intended output would be

    start_date  mid_date    end_date    start_month mid_month   end_month
0   2015-07-31  2019-06-30  2022-03-31  7           6           3
1   2015-08-31  2019-07-31  2022-04-30  8           7           4
2   2015-09-30  2019-08-31  2022-05-31  9           8           5
3   2015-10-31  2019-09-30  2022-06-30  10          9           6
4   2015-11-30  2019-10-31  2022-07-31  11          10          7

CodePudding user response:

Try apply

df[['start_month', 'mid_month', 'end_month']] = df.apply(lambda x : x.dt.month,axis=1)
df
Out[244]: 
  start_date   mid_date   end_date  start_month  mid_month  end_month
0 2015-07-31 2019-06-30 2022-03-31            7          6          3
1 2015-08-31 2019-07-31 2022-04-30            8          7          4
2 2015-09-30 2019-08-31 2022-05-31            9          8          5
3 2015-10-31 2019-09-30 2022-06-30           10          9          6
4 2015-11-30 2019-10-31 2022-07-31           11         10          7

CodePudding user response:

You answered your question by saying "loop through columns":

for column in df:
    df[column.replace("_date", "_month")] = df[column].dt.month

An alternative solution (a variation of @BENY's):

df[df.columns.str.replace("_date", "_month")] = df.apply(lambda x: x.dt.month, axis=1)
  • Related