I need to compare a certain range of value of a column with a specific string in another column to update the date column with 1,7, 30 days
Col A ColB colc
5 Internal 1-1-2022
7 external 1-1-2022
4 external 1-1-2022
3 external 1-1-2022
so if df[ColA]<8
and df[ColB]=External
add 7 days to df[colc]. if if df[ColA]>=8
and df[ColB]=External
add 30 days to df[colc] and if df[ColB]=Internal
add 120 days to df[Colc]
CodePudding user response:
You can use numpy.select
to handle all your conditions:
I simplified them to:
- if ColB is "Internal" -> add 120 days
- else, if ColA is ≤ 8 -> add 7 days
- else -> add 30 days
factor = np.select([df['ColB'].eq('Internal'), df['Col A'].le(8)], [120, 7], 30)
df['new'] = (pd.to_datetime(df['colc'], dayfirst=True)
.add(pd.DateOffset(days=1)*a)
.dt.strftime('%-d-%-m-%-Y')
)
output:
Col A ColB colc new
0 5 Internal 1-1-2022 3-1-2022
1 7 external 1-1-2022 6-1-2022
2 4 external 1-1-2022 1-1-2022
3 3 external 1-1-2022 3-1-2022
CodePudding user response:
Using the same np.select
as @mozway, I would suggest using pd.Timedelta
.
df = pd.DataFrame(
{
'Col A': [5, 7, 4, 3],
'ColB': ['Internal', 'external', 'external', 'external'],
'colc': pd.to_datetime("1/1/2022", format="%m/%d/%Y")
}
)
df['colc'] = [
(a pd.Timedelta(b, 'd')) for (a, b) in
zip(df.colc, np.select([df['ColB'].eq('Internal'), df['Col A'].le(8)], [120, 7], 30))
]
print(df)
Result:
Col A ColB colc
0 5 Internal 2022-05-01
1 7 external 2022-01-08
2 4 external 2022-01-08
3 3 external 2022-01-08