OK, I have a dataframe that looks like the following:
>> df
id         trip_id  segment_id  session_id  start_timestamp         lat_start   lon_start   lat_end     lon_end     travelmode
563097015  563097   15          128618      2017-05-20 17:47:12 01  41.1783308  -8.5949878  41.1784478  -8.5948463  0
563097013  563097   13          128618      2017-05-20 17:45:29 01  41.1781344  -8.5951169  41.1782919  -8.5950689  0
563097011  563097   11          128618      2017-05-20 17:43:41 01  41.1781196  -8.5954075  41.1782139  -8.5950689  0
563097009  563097   9           128618      2017-05-20 17:41:48 01  41.1782497  -8.595197   41.1781101  -8.5954124  0
563097003  563097   3           128618      2017-05-20 17:10:29 01  41.1832512  -8.6081606  41.1782561  -8.5950259  0
In SQL, to count the unique segments (segment_id) per travelmode, I would do:
SELECT travelmode, COUNT(DISTINCT segment_id) AS NumOfSegments
FROM df_table
GROUP BY travelmode
What is the pandas equivalent of this expression?
CodePudding user response:
Maybe:
df.groupby('travelmode').segment_id.nunique()
as suggested in this post.
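
If it helps, here is a minimal sketch of that one-liner on a small made-up frame. Only the column names segment_id and travelmode come from the question; the values are invented for illustration. Adding reset_index(name="NumOfSegments") is optional and just gives the result the same two-column shape as the SQL output.

```python
import pandas as pd

# Illustrative data only: column names match the question, values are made up.
df = pd.DataFrame({
    "segment_id": [15, 13, 13, 11, 9, 3, 3],
    "travelmode": [0, 0, 0, 0, 1, 1, 1],
})

# Equivalent of:
#   SELECT travelmode, COUNT(DISTINCT segment_id) AS NumOfSegments
#   FROM df_table GROUP BY travelmode
result = (
    df.groupby("travelmode")["segment_id"]
      .nunique()                              # distinct segment_id values per group
      .reset_index(name="NumOfSegments")      # Series -> DataFrame with a named count column
)

print(result)
```

Note that nunique() skips NaN by default, which lines up with COUNT(DISTINCT ...) ignoring NULLs in SQL.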