Subtracting dates in Python for Gantt chart

Time:04-20

I am following this tutorial to make a Gantt chart: https://towardsdatascience.com/gantt-charts-with-pythons-matplotlib-395b7af72d72

I have tried to recreate part of the test dataset with the following script:

import pandas as pd   

data = [['TSK M', 'IT', '2022-03-17',  '2022-03-20', '0.0'], ['TSK N', 'MKT', '2022-03-17', '2022-03-19',  '0.0']]    

df = pd.DataFrame(data, columns = ['Task', 'Department',  'Start', 'End', 'Completion'])

Then, when processing the dataframe through the first part of the tutorial, I end up with an error message:

proj_start = df['Start'].min()

df['start_num'] = (df.Start-proj_start).dt.days

TypeError: unsupported operand type(s) for -: 'str' and 'str'

I have tried converting the data to integers with the function int(), but the error persists. Would anyone know what is wrong here?

CodePudding user response:

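You need to convert the date columns to datetime type first. As built, 'Start' and 'End' hold strings, and Python does not define subtraction between two strings, which is what the TypeError reports.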

df['Start'] = pd.to_datetime(df['Start'])
df['End'] = pd.to_datetime(df['End'])

# Or

df[['Start', 'End']] = df[['Start', 'End']].apply(pd.to_datetime)
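
For reference, here is a minimal end-to-end sketch that combines the dataset from the question with the conversion above and then redoes the subtraction that failed. The `duration_days` column is a hypothetical addition for illustration only; it is not taken from the question or confirmed against the tutorial.

```python
import pandas as pd

# Test dataset from the question: the dates are plain strings at this point.
data = [
    ['TSK M', 'IT', '2022-03-17', '2022-03-20', '0.0'],
    ['TSK N', 'MKT', '2022-03-17', '2022-03-19', '0.0'],
]
df = pd.DataFrame(data, columns=['Task', 'Department', 'Start', 'End', 'Completion'])

# Convert the date columns to datetime so that subtraction is defined.
for col in ['Start', 'End']:
    df[col] = pd.to_datetime(df[col])

# The earliest start date is now a Timestamp rather than a string.
proj_start = df['Start'].min()

# datetime column minus Timestamp gives timedeltas; .dt.days extracts whole days.
df['start_num'] = (df['Start'] - proj_start).dt.days

# Hypothetical extra column (an assumption, not from the question): task length in days.
df['duration_days'] = (df['End'] - df['Start']).dt.days

print(df[['Task', 'start_num', 'duration_days']])
```

Calling int() on the strings does not help because a value such as '2022-03-17' is not a valid integer literal; the columns need to be parsed as dates, not as numbers.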