Got NaN when converting from timedelta64 to float

Time:10-29

I have a data frame that looks like this: (image not included)

I calculated the duration by using the following code:

df['dropoff_time'] = pd.to_datetime(df['tpep_dropoff_datetime'])
df['pickup_time'] = pd.to_datetime(df['tpep_pickup_datetime'])
df['duration'] = df['dropoff_time'] - df['pickup_time']

and I am trying to convert the duration of a taxi ride from timedelta64 to float by using the following code:

df['duration'] = df[:5]['duration'] / np.timedelta64(1, 's')

However, the second time I run the code above to convert from timedelta64 to float, I keep getting this message: (image not included)

Below is a picture showing the data types of each column: (image not included)

So I am getting the float type for the duration column, which is what I want. However, some of the rows contain NaN, as shown in the picture. I don't understand why this happens or how to fix it. Can someone please help?

CodePudding user response:

The problem is that [:5] selects only the first 5 rows, so only those 5 values are converted. When the result is assigned back to the column, pandas aligns it on the index and fills all the other rows with NaN:

df['duration'] = df[:5]['duration'] / np.timedelta64(1, 's')
                 ^^^^^^^        
                  here
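
The alignment behaviour can be reproduced on a small frame. This is only a sketch: the seven rows of sample data and the duration_partial column are made up for illustration, while the other column names follow the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (7 rides); values are invented for illustration.
df = pd.DataFrame({
    "pickup_time": pd.to_datetime(["2020-01-01 00:00:00"] * 7),
    "dropoff_time": pd.to_datetime([
        "2020-01-01 00:05:00", "2020-01-01 00:10:00", "2020-01-01 00:15:00",
        "2020-01-01 00:20:00", "2020-01-01 00:25:00", "2020-01-01 00:30:00",
        "2020-01-01 00:35:00",
    ]),
})
df["duration"] = df["dropoff_time"] - df["pickup_time"]

# df[:5] keeps only index labels 0..4, so the result has 5 values.
partial = df[:5]["duration"] / np.timedelta64(1, "s")

# Assigning a shorter Series aligns on the index; labels 5 and 6
# have no matching value and become NaN.
df["duration_partial"] = partial
nan_count = int(df["duration_partial"].isna().sum())
print(df[["duration", "duration_partial"]])
```

Writing to a separate column here keeps the original timedelta values intact, so the effect of the slice is visible side by side.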

So the solution is to remove [:5]:

df['duration'] = (df['dropoff_time'] - df['pickup_time']) / pd.Timedelta("1s")

Or:

df['duration'] = (df['dropoff_time'] - df['pickup_time']).dt.total_seconds()
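
As a quick check, both variants can be run on a few rows to confirm they give the same float seconds with no NaN. The three sample rides below are made up; the column names are taken from the question. Because the duration is recomputed from the datetime columns each time, the last assignment can be re-run without depending on the current dtype of the duration column.

```python
import pandas as pd

# Hypothetical sample rides; the timestamps are invented for illustration.
df = pd.DataFrame({
    "tpep_pickup_datetime": [
        "2020-01-01 00:00:00", "2020-01-01 01:00:00", "2020-01-01 02:00:00",
    ],
    "tpep_dropoff_datetime": [
        "2020-01-01 00:05:30", "2020-01-01 01:10:00", "2020-01-01 02:00:45",
    ],
})
df["pickup_time"] = pd.to_datetime(df["tpep_pickup_datetime"])
df["dropoff_time"] = pd.to_datetime(df["tpep_dropoff_datetime"])

# A timedelta64 Series covering every row (no slicing).
delta = df["dropoff_time"] - df["pickup_time"]

# Variant 1: divide by a one-second Timedelta.
by_division = delta / pd.Timedelta("1s")

# Variant 2: use the .dt accessor of the timedelta Series.
by_accessor = delta.dt.total_seconds()

df["duration"] = by_accessor
print(df[["pickup_time", "dropoff_time", "duration"]])
```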