I'm working to analyze a Status Change over time. I have a large table in Excel as:
ID | Date | Status |
---|---|---|
01 | Aug-01 | Pending |
01 | Aug-02 | Pending |
01 | Aug-03 | Pending |
02 | Aug-01 | Pending |
02 | Aug-02 | Pending |
02 | Aug-03 | Assigned |
There are thousands of rows of source data... I am only looking at data from the past 7 days. Essentially I'm looking for change activity since the last status report.
I use Power query to read the table and then pivot the data so I get the following results:
ID | Aug-01 | Aug-02 | Aug-03 |
---|---|---|---|
01 | Pending | Pending | Pending |
02 | Pending | Pending | Assigned |
I only expect a dozen or so rows (ID's) to be reported each week with an ever changing set of pivoted date columns as we progress through the year.
I want to get rid of the rows of data where each column is exactly the same... Each day has a full set of data including historically closed items. I only want to see data where there is a change in the series from the original table.
I'd be happy to do it prior to the Pivot, honestly that may be the better approach, but I don't understand how to remove rows where based upon the ID and the Date where the Status doesn't change.
I'm currently pushing the data from the Pivot
Any suggestions would be so appreciated!
Thanks Rob
CodePudding user response:
Try this code, on the first table in powerquery. Pulls last 7 days. Pivots. Checks if the contents of all the columns are identical, then filters on that
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Date", type date}, {"Status", type text}}),
//dates from last seven days, to examine
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Date] >Date.AddDays(DateTime.Date(DateTime.LocalNow()),-7)),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Filtered Rows", {{"Date", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Filtered Rows", {{"Date", type text}}, "en-US")[Date]), "Date", "Status"),
coltocheck = List.RemoveFirstN(Table.ColumnNames(#"Pivoted Column"),1),
IDX = Table.AddIndexColumn(#"Pivoted Column","Index",0,1),
totals = Table.AddColumn(IDX, "Custom", each List.Count(List.Distinct(Record.ToList( Table.SelectColumns(IDX,coltocheck){[Index]}) ))),
#"Filtered Rows1" = Table.SelectRows(totals, each ([Custom] = 2)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows1",{"Index", "Custom"})
in #"Removed Columns"
Alternate version that uses grouping
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Date", type date}, {"Status", type text}}),
//dates from last seven days, to examine
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Date] >Date.AddDays(DateTime.Date(DateTime.LocalNow()),-7)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"ID"}, {{"Count", each _, type table }, {"Unique", each List.Count(List.Distinct(_[Status]))}}),
#"Filtered Rows1" = Table.SelectRows(#"Grouped Rows", each ([Unique] = 2)),
#"Expanded Count" = Table.ExpandTableColumn(#"Filtered Rows1", "Count", {"Date", "Status"}, {"Date", "Status"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Count",{"Unique"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Date", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Date", type text}}, "en-US")[Date]), "Date", "Status")
in #"Pivoted Column"