Does anybody know how I can use Power Query to group sensor events by the gap between the previous datetime reading and the current one? Apologies, I wasn't sure how to phrase this question, which is probably why I'm having no luck with Google.
What I want to do is group some data by start and end datetimes based on the gap between the current reading and the previous reading.
If you consider the data below:
It's easy to group it like so:
But this would actually be two discrete events, not one - if you notice the gap between 10:00 and 13:00, I'm assuming that the events need to be grouped within two hours of each other, like:
Edit - So, to be clear, if a reading happens more than two hours after the last then it can be considered a new event.
Is an efficient solution to this possible within Power Query? As you can probably imagine when it comes to datetime readings and sensors, there are a tonne of data.
I could and would (and did) write code to do this in other situations, but this report already groups by the min and max datetime via Power Query, and so it would be much less work if I can just add the additional time cut-off group logic there.
Thanks for your help!
Phil.
Edit - I've seen a solution where you add an index and subtract the date to create island groups for dates (not datetimes), but I don't think (although I could be wrong) that this would work here as the readings don't come through at exactly regular intervals (they might be every 1 hr 2 mins or every 58 mins etc.).
CodePudding user response:
See if this helps. Yes, I am creating island groups using an index
Merge row above each row into adjacent column
Take the duration between the two columns
Add index
If duration > 2 then put in the index
Fill down
Group on index and take min/max
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// stolen from Ron Rosenfeld, great way to combine list next to itself, offset
shiftedList = {null} & List.RemoveLastN(Source[Reading Datetime],1),
custom1 = Table.ToColumns(Source) & {shiftedList},
custom2 = Table.FromColumns(custom1,Table.ColumnNames(Source) & {"Previous Row"}),
#"Added Index" = Table.AddIndexColumn(custom2, "Index", 0, 1),
// copy over index when duration > 2 hours between columns
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each if [Previous Row]=null then [Index] else if Number.From([Reading Datetime]-[Previous Row])*24 >2 then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
#"Grouped Rows" = Table.Group(#"Filled Down", {"Custom"}, {{"Reading Start Datetime", each List.Min([Reading Datetime]), type datetime}, {"Reading End Datetime", each List.Max([Reading Datetime]), type datetime}}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Custom"})
in #"Removed Columns"