Home > Blockchain >  Finding conversion rate from events DataFrame
Finding conversion rate from events DataFrame

Time:04-05

I have the following table:

user id time event
a 1 2021.12.12 10:08:39:399 viewed
a 2 2021.12.12 10:08:39:402 clicked
a 3 2021.12.23 3:43:19:397 viewed
a 4 2021.12.23 3:47:11:131 viewed
a 5 2021.12.30 19:20:31:493 viewed

How would I go about trying to find the conversion rate grouped by each user? By this I mean the percentage of views that are followed up by a click within a certain timeframe (lets say 30s). In this case user a has viewed four times and clicked once, with the click being in the allotted timeframe - giving us a conversion rate of 1/4 = 25%.

I tried doing this by splitting the frame by event then using pd.merge_asof() which works for most cases but sometimes user id's are replaced by nulls and sometimes not all viewed events are carried over into the new table. Would appreciate the help!

CodePudding user response:

Try this:

# Convert the `time` column to Timestamp, which make time calculation a lot easier
df["time"] = pd.to_datetime(df["time"], format="%Y.%m.%d %H:%M:%S:%f")

# Sort the dataframe
df = df.sort_values(["user", "time"])

# Definition of success
success = (
    df["user"].eq(df["user"].shift())   # same user as previous row
    & df["event"].eq("clicked")         # current row is "clicked"
    & df["event"].shift().eq("viewed")  # previous row is "viewed"
    & df["time"].diff().dt.total_seconds().lt(30) # time difference is less than 30 secs
)

# Assemble the result
result = (
    df.assign(is_view=lambda x: x["event"].eq("viewed"), success=success)
    .groupby("user").agg(
        views=("is_view", "sum"),   # count number of views
        success=("success", "sum")  # count number of successes
    ).assign(
        rate=lambda x: x["success"] / x["views"]
    )
)

CodePudding user response:

You could parse clicked to 1 and viewed to 0 and then do a groupby with sum and count aggregations. Afterwards you divide the count column from the sum column and get your result.

df # your data

df["success"] = df["event"].apply(lambda x: 1 if x == "clicked" else 0)

results = df.groupby("user").agg({'success' : ['sum', 'count']})

results["conversion"] = results["sum"] / results["count"]
  • Related