Python/Pandas - How to do calculations with time durations?

Time:04-16

My data has a column "duration_time" that holds a duration for each row in minutes and seconds (e.g. 10:58).

How can I do calculations on this "duration_time" column, e.g. sum all the values to get the total duration, or compute the mean duration?

What dtype is best suited to this type of time calculation?

The dtype of my column is listed as object, which doesn't let me perform calculations.

Thank you in advance for your help!

CodePudding user response:

Pandas does have a Timedelta type that is intended to represent durations.

You will have to do some parsing first, but pandas has a to_timedelta function that will help you.

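import pandas as pd


def parser(t):
    # "10:58" -> 10 * 60 + 58 = 658 seconds
    mins, secs = t.split(":")
    return 60 * int(mins) + float(secs)


# Convert the per-row second counts into a Timedelta Series
tds = pd.to_timedelta(
    df.duration_time.map(parser),
    unit="seconds",
)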

You may find this more suited to the kind of calculation you want to do.
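As a minimal sketch of the calculations asked about, here is how the sum and mean can be taken once the column is a Timedelta Series. The sample DataFrame and its values are assumptions made up for illustration; only the "duration_time" column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data in the "MM:SS" format from the question.
df = pd.DataFrame({"duration_time": ["10:58", "03:12", "00:45"]})


def parser(t):
    # Split "MM:SS" and convert to a whole number of seconds.
    mins, secs = t.split(":")
    return 60 * int(mins) + int(secs)


# Build a timedelta64 Series from the per-row second counts.
tds = pd.to_timedelta(df["duration_time"].map(parser), unit="s")

total = tds.sum()     # a single pandas Timedelta
average = tds.mean()  # also a Timedelta
longest = tds.max()

print(total, average, longest)
print(total.total_seconds())  # plain float, if a number is easier to work with
```

The results are Timedelta objects, so they can be compared, added, or converted to a number with total_seconds() as needed.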

CodePudding user response:

The duration_time column is currently an 'object' dtype, meaning it's stored as strings. To make it so you can do math with it, you need to extract the numeric values. The code below accounts for durations in that format and returns the numeric value in seconds:

def str_to_seconds(str_val):
    # Drop surrounding whitespace, then split "MM:SS" into its two parts
    str_val = str_val.strip()
    mins, secs = str_val.split(':')
    return int(mins) * 60 + float(secs)

your_dataframe['duration_seconds'] = your_dataframe['duration_time'].map(str_to_seconds)
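With a numeric seconds column in place, the usual aggregations work directly. The sketch below uses made-up sample data, and seconds_to_str is a hypothetical helper (not part of pandas) for turning a result back into minutes and seconds; it rounds to the nearest whole second.

```python
import pandas as pd


def str_to_seconds(str_val):
    # "10:58" -> 658.0
    mins, secs = str_val.strip().split(":")
    return int(mins) * 60 + float(secs)


def seconds_to_str(total_seconds):
    # Hypothetical helper: format a number of seconds as "M:SS".
    mins, secs = divmod(int(round(total_seconds)), 60)
    return f"{mins}:{secs:02d}"


# Hypothetical sample data, including stray whitespace that strip() handles.
your_dataframe = pd.DataFrame({"duration_time": ["10:58", " 03:12", "00:45 "]})
your_dataframe["duration_seconds"] = your_dataframe["duration_time"].map(str_to_seconds)

total_seconds = your_dataframe["duration_seconds"].sum()
mean_seconds = your_dataframe["duration_seconds"].mean()

print(seconds_to_str(total_seconds))
print(seconds_to_str(mean_seconds))
```

Keeping the values as plain floats makes arithmetic simple, at the cost of having to format them back into minutes and seconds for display.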