Home > OS >  Return a 1 for the first occurrence of each value in a column in Power Query, returning a 0 otherwis
Return a 1 for the first occurrence of each value in a column in Power Query, returning a 0 otherwis

Time:08-25

I have a list of data in which I want to return a 1 in a different column when each value appears for the first time. I can do this in an Excel table with a formula '=(COUNTIF($A$2:$A2,$A2)=1) 0', but I can't figure out how to do this in Power Query. An example can be seen here:

User ID First instance?
123 1
123 0
123 0
234 1
345 1
345 0
456 1
567 1
567 0
567 0
567 0

In my table, the User IDs obviously aren't in number order, but this example should show what I'm aiming to do.

I have all of the table data in Power Query, and this is the last thing I can't figure out how to do. I am new to Power Query, and have searched around for a solution, but I haven't been able to find one so any help would be much appreciated. The data isn't as simple as the table. There are lots more columns with different data, but this is the column I am looking at. Thanks!

CodePudding user response:

One way in powerquery is a shifted column which is much faster than an index for large data sets

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// create shifted column
    shiftedList = {null} & List.RemoveLastN(Source[User ID],1),
    custom1 = Table.ToColumns(Source) & {shiftedList},
    custom2 = Table.FromColumns(custom1,Table.ColumnNames(Source) & {"Next Row"}),
#"Added Custom" = Table.AddColumn(custom2, "First Instance?", each if [User ID]=[Next Row] then 0 else 1),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Next Row"})
in #"Removed Columns"

Or you can use a grouping

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows" = Table.Group(Source, {"User ID"}, {{"data", each Table.RemoveColumns(Table.AddColumn(Table.AddIndexColumn(_, "Index", 0, 1, Int64.Type), "First Instance?", each if [Index]=0 then 1 else 0),{"Index"}), type table }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"First Instance?"}, {"First Instance?"})
in  #"Expanded data"

CodePudding user response:

Right, one idea could be to use an index-column on your presumably sorted User-ID's column:

enter image description here

M-code:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"User ID", Int64.Type}}),
    Sorted = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(Sorted, "First_Instance", each if [Index]>0 then if Sorted{[Index] - 1}[User ID] = [User ID] then 0 else 1 else 1),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"})
in
    #"Removed Columns"

CodePudding user response:

Thanks horseyride, I used the grouping option and it worked perfectly.

  • Related