I have a list of data in which I want to return a 1 in a different column when each value appears for the first time. I can do this in an Excel table with a formula '=(COUNTIF($A$2:$A2,$A2)=1) 0', but I can't figure out how to do this in Power Query. An example can be seen here:
User ID | First instance? |
---|---|
123 | 1 |
123 | 0 |
123 | 0 |
234 | 1 |
345 | 1 |
345 | 0 |
456 | 1 |
567 | 1 |
567 | 0 |
567 | 0 |
567 | 0 |
In my table, the User IDs obviously aren't in number order, but this example should show what I'm aiming to do.
I have all of the table data in Power Query, and this is the last thing I can't figure out how to do. I am new to Power Query, and have searched around for a solution, but I haven't been able to find one so any help would be much appreciated. The data isn't as simple as the table. There are lots more columns with different data, but this is the column I am looking at. Thanks!
CodePudding user response:
One way in powerquery is a shifted column which is much faster than an index for large data sets
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// create shifted column
shiftedList = {null} & List.RemoveLastN(Source[User ID],1),
custom1 = Table.ToColumns(Source) & {shiftedList},
custom2 = Table.FromColumns(custom1,Table.ColumnNames(Source) & {"Next Row"}),
#"Added Custom" = Table.AddColumn(custom2, "First Instance?", each if [User ID]=[Next Row] then 0 else 1),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Next Row"})
in #"Removed Columns"
Or you can use a grouping
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows" = Table.Group(Source, {"User ID"}, {{"data", each Table.RemoveColumns(Table.AddColumn(Table.AddIndexColumn(_, "Index", 0, 1, Int64.Type), "First Instance?", each if [Index]=0 then 1 else 0),{"Index"}), type table }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"First Instance?"}, {"First Instance?"})
in #"Expanded data"
CodePudding user response:
Right, one idea could be to use an index-column on your presumably sorted User-ID's column:
M-code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"User ID", Int64.Type}}),
Sorted = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(Sorted, "First_Instance", each if [Index]>0 then if Sorted{[Index] - 1}[User ID] = [User ID] then 0 else 1 else 1),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"})
in
#"Removed Columns"
CodePudding user response:
Thanks horseyride, I used the grouping option and it worked perfectly.