I am pretty new to Python and am working on a project of my own, so I need a little help understanding a few things.
I have a DataFrame that contains Netflix data.
I need to find the sum of the Duration column for each Profile Name, i.e. I want to know who watches Netflix the most.
How can I add up the Duration column? I am unable to understand the to_timedelta function.
CodePudding user response:
You can use a combination of to_timedelta and GroupBy.sum:
import pandas as pd

out = (pd.to_timedelta(df['Duration'])       # convert strings to timedelta
         .groupby(df['Profile Name']).sum()  # sum per Profile Name
         .sort_values(ascending=False)       # sort by total duration, largest first
      )
print(out)
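To see what to_timedelta is doing, here is a minimal self-contained sketch of the same approach. The sample rows and profile names are made up for illustration, and it assumes the Duration column holds strings like "01:15:00" (hours:minutes:seconds); your real data may be formatted differently.

```python
import pandas as pd

# Hypothetical sample data standing in for the real Netflix export.
df = pd.DataFrame({
    "Profile Name": ["Alice", "Bob", "Alice", "Bob", "Cara"],
    "Duration": ["00:30:00", "01:15:00", "00:45:30", "00:05:00", "02:00:00"],
})

# Strings cannot be added up meaningfully, so convert them to timedelta values first.
durations = pd.to_timedelta(df["Duration"])

# Group the converted durations by profile, sum each group, and sort largest first.
totals = (
    durations
    .groupby(df["Profile Name"])
    .sum()
    .sort_values(ascending=False)
)
print(totals)

# The first index label after sorting is the profile with the most watch time.
top_profile = totals.index[0]
print(top_profile)
```

The key point is that to_timedelta turns each string into a value that supports arithmetic, which is why the sum per profile works afterwards.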