Converting Dates in different format in the Row into number of years

Time:01-12

I have data with multiple columns, each representing the start and end date of a task in different formats. All I want is a function that can handle multiple date formats and convert each range into the corresponding number of years/months.

df_1 :

Name          Task_1                    

Robert      'Nov 2022 - Dec 2022'      
billy       '06/2021 - 06/2022'   
stuart      '2021 - 2022'

Expected output :

Name          Task_1                    Time Required        

Robert      'Nov 2022 - Dec 2022'           1 month
billy       '06/2021 - 06/2022'             12 months
stuart      '2021 - 2022'                   12 months
def years_in_range(date_range):
    date_range = pd.to_datetime(date_range, format='%B %Y to %B %Y')
    delta = date_range.iloc[1] - date_range.iloc[0]
    years = delta / timedelta(days=365)
    return years

I used the function above to get the time required in years, but it does not give the expected output.

CodePudding user response:

You can split the column into two parts, convert each with pd.to_datetime, and subtract one from the other to get a month offset, an integer, or a timedelta:

s = df['Task_1'].str.split(r'\s*-\s*')

df['Time Required'] = (pd
   .to_datetime(s.str[1]).dt.to_period('M')
   .sub(pd.to_datetime(s.str[0]).dt.to_period('M'))
   .apply(lambda x: x.n)  # only if you want integers
)

Output (before the optional .apply step, which turns the offsets into integers):

     Name               Task_1     Time Required
0  Robert  Nov 2022 - Dec 2022        <MonthEnd>
1   billy    06/2021 - 06/2022  <12 * MonthEnds>
2  stuart          2021 - 2022  <12 * MonthEnds>

Or, to get a timedelta in days, subtract the timestamps directly:

s = df['Task_1'].str.split(r'\s*-\s*')

df['Time Required'] = (pd
   .to_datetime(s.str[1])
   .sub(pd.to_datetime(s.str[0]))
)

Output:

     Name               Task_1  Time Required
0  Robert  Nov 2022 - Dec 2022        30 days
1   billy    06/2021 - 06/2022       365 days
2  stuart          2021 - 2022       365 days
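
If the column mixes formats that pd.to_datetime cannot infer in one pass (newer pandas versions may be stricter about this), one alternative is to parse each half yourself. The following is only a sketch using the standard library: the names parse_partial_date, months_in_range and CANDIDATE_FORMATS are made up for illustration, and the list of formats is an assumption based on the three sample rows. Month names are matched according to the current locale, so it assumes English month names.

```python
import re
from datetime import datetime

# Formats tried in order. This list is an assumption drawn from the sample
# rows ('Nov 2022', '06/2021', '2021'); extend it for other layouts.
CANDIDATE_FORMATS = ("%b %Y", "%B %Y", "%m/%Y", "%Y")


def parse_partial_date(text):
    """Parse a month/year or year-only string with the first format that fits."""
    cleaned = text.strip().strip("'")
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {text!r}")


def months_in_range(date_range):
    """Return the number of calendar months between the two ends of 'start - end'."""
    cleaned = date_range.strip().strip("'")
    # Split on the first dash only, ignoring the spaces around it.
    start_text, end_text = re.split(r"\s*-\s*", cleaned, maxsplit=1)
    start = parse_partial_date(start_text)
    end = parse_partial_date(end_text)
    # A missing month defaults to January, so '2021 - 2022' counts as 12 months.
    return (end.year - start.year) * 12 + (end.month - start.month)


for task in ["Nov 2022 - Dec 2022", "06/2021 - 06/2022", "2021 - 2022"]:
    print(task, "->", months_in_range(task), "months")
```

With the DataFrame from the question, this could be applied as df['Task_1'].apply(months_in_range), and dividing the result by 12 gives years.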