How to find out how long people take to pay for the services I offer

Time:01-20

I have data that contains id, gender, price, time to pay. For example:

import pandas as pd
df1 = pd.DataFrame({'id': ['1','2','3','4','5','6','7','8'],
                    'gender': ['Male','Female','Male','Female','Male','Female','Male','Male'],
                    'price': [250, 1000,300, 250, 1000, 500, 450, 500],
                    'timeToPay':['0 days 01:20:00','1 days 03:24:02','0 days 12:45:55','0 days 05:38:20','0 days 02:44:12','0 days 11:25:38','1 days 01:11:00','0 days 05:22:00']})

Time to pay is the time difference between when the customer orders and when they pay (datatype timedelta64[ns]).

How can I get the most common time-to-pay range for this data? I mean, do people pay within 0-1 hours, 4-6 hours, or maybe 1-2 days? I want to know how long people take to pay for the services I offer.

I tried grouping the data by time to pay, but I don't think it gives the information I need.

CodePudding user response:

I would go for a histogram. Try it with different bin sizes; the right size depends on the number of rows you have.

If you need to measure it for different services, you might need to split the data with a group by first and then plot a histogram for each group.

pandas.DataFrame.hist
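
To illustrate how the bin size changes the picture, here is a minimal sketch that counts payments per bin instead of plotting them. It assumes the sample values from the question; the helper name `count_per_bin` and the 6-hour and 12-hour widths are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Sample values from the question, parsed into timedeltas.
time_to_pay = pd.to_timedelta(pd.Series([
    '0 days 01:20:00', '1 days 03:24:02', '0 days 12:45:55', '0 days 05:38:20',
    '0 days 02:44:12', '0 days 11:25:38', '1 days 01:11:00', '0 days 05:22:00',
]))
hours = time_to_pay.dt.total_seconds() / 3600  # payment delay in hours


def count_per_bin(values, width):
    """Count values in consecutive bins of `width` hours (hypothetical helper)."""
    edges = np.arange(0, values.max() + width, width)
    # np.histogram uses half-open bins [left, right), except the last one,
    # which also includes its right edge.
    counts, _ = np.histogram(values, bins=edges)
    return edges, counts


for width in (6, 12):  # bin widths in hours, chosen only for illustration
    edges, counts = count_per_bin(hours, width)
    for left, n in zip(edges[:-1], counts):
        print(f"{width}h bins: from {left:.0f}h to {left + width:.0f}h -> {n} payments")
```

With only eight rows the result is sensitive to the bin width, which is why it is worth trying a few widths on the real data.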

Further, you could calculate the average, but whether that is meaningful depends on your data distribution, so you need to know your data first.
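
As a rough sketch of that check, assuming the sample values from the question, the mean can be compared with the median and quartiles. A mean well above the median suggests that a few slow payers are pulling the average up.

```python
import pandas as pd

# Sample values from the question, parsed into timedeltas.
time_to_pay = pd.to_timedelta(pd.Series([
    '0 days 01:20:00', '1 days 03:24:02', '0 days 12:45:55', '0 days 05:38:20',
    '0 days 02:44:12', '0 days 11:25:38', '1 days 01:11:00', '0 days 05:22:00',
]))

mean = time_to_pay.mean()      # sensitive to a few very late payments
median = time_to_pay.median()  # the delay that half of the customers stay under
quartiles = time_to_pay.quantile([0.25, 0.5, 0.75])

print("mean:  ", mean)
print("median:", median)
print(quartiles)
```

On this sample the mean is noticeably larger than the median, so the median is probably the safer single number to report.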

CodePudding user response:

IIUC,

I modified your code a little bit to make it easier to reproduce.

import pandas as pd
df1 = pd.DataFrame({'id': ['1','2','3','4','5','6','7','8'],
                    'gender': ['Male','Female','Male','Female','Male','Female','Male','Male'],
                    'price': [250, 1000,300, 250, 1000, 500, 450, 500],
                    'timeToPay':[ '0 days 01:20:00'
                                 ,'1 days 03:24:02'
                                 ,'0 days 12:45:55'
                                 ,'0 days 05:38:20'
                                 ,'0 days 02:44:12'
                                 ,'0 days 11:25:38'
                                 ,'1 days 01:11:00'
                                 ,'0 days 05:22:00']})
# Parse the strings into timedelta64[ns] values in one vectorized call
df1['timeToPay'] = pd.to_timedelta(df1['timeToPay'])

Now timeToPay is a timedelta. You can then convert timeToPay to whole hours and whole days (rounded up) with this snippet.

import math
df1['timeToPay_hour']=df1['timeToPay'].apply(lambda x: math.ceil(x.total_seconds()/(60*60)))
df1['timeToPay_day']=df1['timeToPay'].apply(lambda x: math.ceil(x.total_seconds()/(24*60*60)))

df1

Now your df1 looks like this:

   id  gender  price        timeToPay  timeToPay_hour  timeToPay_day
0   1    Male    250  0 days 01:20:00               2              1
1   2  Female   1000  1 days 03:24:02              28              2
2   3    Male    300  0 days 12:45:55              13              1
3   4  Female    250  0 days 05:38:20               6              1
4   5    Male   1000  0 days 02:44:12               3              1
5   6  Female    500  0 days 11:25:38              12              1
6   7    Male    450  1 days 01:11:00              26              2
7   8    Male    500  0 days 05:22:00               6              1
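
The rounded-up hour and day columns can also be computed without `apply`. This is a sketch of a vectorized alternative, assuming the same sample values; the names `hours_up` and `days_up` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample values from the question, parsed into timedeltas.
time_to_pay = pd.to_timedelta(pd.Series([
    '0 days 01:20:00', '1 days 03:24:02', '0 days 12:45:55', '0 days 05:38:20',
    '0 days 02:44:12', '0 days 11:25:38', '1 days 01:11:00', '0 days 05:22:00',
]))

# Dividing a timedelta Series by a Timedelta yields plain floats,
# which np.ceil then rounds up to whole units.
hours_up = np.ceil(time_to_pay / pd.Timedelta(hours=1)).astype(int)
days_up = np.ceil(time_to_pay / pd.Timedelta(days=1)).astype(int)

print(pd.DataFrame({'timeToPay_hour': hours_up, 'timeToPay_day': days_up}))
```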

Then you can compare timeToPay_hour across genders like this.

# hist() only plots numeric columns, so split by gender with `by`
df1.hist(column='timeToPay_hour', by='gender', bins=5)
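
If named ranges such as "0-1 hours" or "1-2 days" are more useful than a plot, `pd.cut` can assign each payment to a range and the ranges can then be counted. This is a minimal sketch; the bin edges and labels below are assumptions chosen to mirror the ranges mentioned in the question, so adjust them to your data.

```python
import pandas as pd

df = pd.DataFrame({
    'gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
    'timeToPay': pd.to_timedelta([
        '0 days 01:20:00', '1 days 03:24:02', '0 days 12:45:55', '0 days 05:38:20',
        '0 days 02:44:12', '0 days 11:25:38', '1 days 01:11:00', '0 days 05:22:00',
    ]),
})

hours = df['timeToPay'].dt.total_seconds() / 3600

# Assumed bin edges in hours; each bin includes its right edge (pd.cut default).
edges = [0, 1, 4, 6, 12, 24, 48]
labels = ['0-1 h', '1-4 h', '4-6 h', '6-12 h', '12-24 h', '1-2 days']
df['pay_range'] = pd.cut(hours, bins=edges, labels=labels)

# How many customers fall into each range, overall and per gender.
range_counts = df['pay_range'].value_counts(sort=False)
by_gender = pd.crosstab(df['pay_range'], df['gender'])

print(range_counts)
print(by_gender)
```

The range with the highest count is the most common time to pay; with uneven bin widths, keep in mind that wider ranges naturally collect more rows.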

Hope this helps.
