I have a df that contains four columns:
col1 col2 col3 col4
0 2021-10-18 2021-10-27 2021-11-08 2021-11-21
1 2021-01-30 2021-02-01 2021-02-28 2021-03-18
2 2021-08-17 2021-08-20 2021-09-02 2021-09-07
I would like to count the number of businessdays between the columns. So far I did this:
col1 = [d.date() for d in df['col1']]
col2 = [d.date() for d in df['col2 ']]
col3 = [d.date() for d in df['col3']]
col4 = [d.date() for d in df['col4 ']]
df['bday1'] = np.busday_count(col1, col2)
df['bday2'] = np.busday_count(col2, col3)
df['bday3'] = np.busday_count(col3, col4)
Is there a way to do this more efficiently? For example, for a case when there would be more columns? I'd be grateful for your help!
CodePudding user response:
Try this:
cols = ['col1', 'col2', 'col3', 'col4']
for i, (col1, col2) in enumerate(zip(cols[:-1], cols[1:])):
df[f'bday{i 1}'] = np.busday_count(
df[col1].dt.date.to_list(),
df[col2].dt.date.to_list(),
)
You just need to specify the columns you want to use.