I need to clean up the rows of a DataFrame (`df`) in order to calculate the time spent inside the building. Sometimes the reader has recorded multiple entries or exits within a short space of time, which is obviously an error. The erroneous rows are not always exact duplicates; they may be a few seconds or minutes apart.
What is the most efficient way to clean this up so that I can work out how long was spent inside?
We can assume that entry and exit happen within one day, i.e. no one spends a night there.
Below is a sample of the dataframe.
Date | Type |
---|---|
2021-11-10 19:31:50 | Exit |
2021-11-10 19:31:50 | Exit |
2021-11-10 18:49:21 | Entry |
2021-11-09 20:14:21 | Exit |
2021-11-09 19:34:05 | Entry |
Edit:
The expected output would have clean entry and exit times (say, only counting stays of more than 10 minutes inside).
You cannot just delete specific rows by hand; assume we don't know how many rows there are.
Date | Type |
---|---|
2021-11-10 19:31:50 | Exit |
2021-11-10 18:49:21 | Entry |
2021-11-09 20:14:21 | Exit |
2021-11-09 19:34:05 | Entry |
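To make the expected output concrete, one possible reading of the rule can be sketched as follows. It is only a sketch under stated assumptions: `Date` is parsed as a datetime, a row counts as an error when the chronologically previous row has the same `Type` (so the first row of each run is kept), and stays of 10 minutes or less are discarded. The names `clean_events`, `time_inside`, and `min_stay` are made up for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2021-11-10 19:31:50",
        "2021-11-10 19:31:50",
        "2021-11-10 18:49:21",
        "2021-11-09 20:14:21",
        "2021-11-09 19:34:05",
    ]),
    "Type": ["Exit", "Exit", "Entry", "Exit", "Entry"],
})


def clean_events(events):
    """Drop rows that repeat the Type of the chronologically previous row."""
    # Stable sort so rows with identical timestamps keep their relative order.
    chrono = events.sort_values("Date", kind="mergesort").reset_index(drop=True)
    # Assumption: valid data alternates Entry/Exit, so a repeated Type is an error.
    is_repeat = chrono["Type"].eq(chrono["Type"].shift())
    kept = chrono[~is_repeat]
    # Return newest-first, matching the layout used in the question.
    return kept.sort_values("Date", ascending=False, kind="mergesort").reset_index(drop=True)


def time_inside(cleaned, min_stay=pd.Timedelta(minutes=10)):
    """Pair each Entry with the Exit that directly follows it."""
    chrono = cleaned.sort_values("Date", kind="mergesort").reset_index(drop=True)
    nxt = chrono.shift(-1)  # the row that follows each row in time
    is_visit = chrono["Type"].eq("Entry") & nxt["Type"].eq("Exit")
    visits = pd.DataFrame({
        "Entry": chrono.loc[is_visit, "Date"],
        "Exit": nxt.loc[is_visit, "Date"],
    })
    visits["Duration"] = visits["Exit"] - visits["Entry"]
    # Assumption: stays no longer than min_stay are treated as reader noise.
    return visits[visits["Duration"] > min_stay].reset_index(drop=True)


clean = clean_events(df)
visits = time_inside(clean)
total = visits["Duration"].sum()
print(clean)
print(visits)
print(total)
```

Keeping the first row of each run is a choice, not a requirement; keeping the last row instead would only need the comparison to use `shift(-1)`.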
CodePudding user response: