Home > Software engineering >  Power Query Formula Language - Get children based on parent adjacent column value
Power Query Formula Language - Get children based on parent adjacent column value

Time:10-10

bear with me, this is my first attempt using the Power Query Formula Language. I need some advice on how to solve a particular problem sorting and filtering source data.

I now got this current source data, structured like this:

enter image description here

Using this power query:

let
    Source = Excel.CurrentWorkbook(){[Name="EmployeeOrganization"]}[Content],
    ListEmployees = Table.Group(Source, {"Organization"}, {{"Employee", each Text.Combine([Employee],","), type text}}),
    CountEmployees = Table.AddColumn(ListEmployees, "Count", each List.Count(Text.Split([Employee],","))),
    SplitEmployees = Table.SplitColumn(ListEmployees, "Employee", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),List.Max(CountEmployees[Count])),
    Transpose = Table.Transpose(SplitEmployees),
    PromoteHeaders = Table.PromoteHeaders(Transpose, [PromoteAllScalars=true])
in
    PromoteHeaders

I am able to produce the following result:

enter image description here

To avoid having to add the organization name to every single employee in the source, I would like the organization name to act as an parent-group, with the employees as children. I would also like the result to only fetch the organizations ( employees) that has status Active = Yes.

The desired source should look similar to this:

enter image description here

So that the desired result should look similar to this: (Apple is gone due to Active = NO)

enter image description here

I am stuck at this point and need some advice on how can I modify my Power Query Formula to:

  1. Only fetch Organizations that are Active (Does not matter if they have employees or not)
  2. Somehow link the children Employees to the correct Organizations. (Without having to write the org name in every adjacent employee column)

(Excel file can be found enter image description here

Custom Function
Rename: fnPivotAll

//credit: Cam Wallace  https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/

(Source as table,
    ColToPivot as text,
    ColForValues as text)=> 

let
     PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
     #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
 
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
    
    let
        PartialRecord = Record.SelectFields(rec,fieldnames),
        RecordToList = Record.ToList(PartialRecord),
        Table = Table.FromColumns(RecordToList,fieldnames)
    in
        Table,
 
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"

Basic Query

let

//Read in data and set data types
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k12yc9LzEkpVtJRAqLI1GKlWJ1oEDMgtSS1CCQK5XvlpyLzEvPgXMeCgpxUiH6/fJgC38SiSiT1jjmZyXAN7vn56TAdyDYmluYgaXHKTwLzYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Organization = _t, Employee = _t, Active = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Organization", type text}, {"Employee", type text}, {"Active", type text}}),

//replace blanks with null if not already there
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Organization", "Employee", "Active"}),

//fill down the Company and active columns    
    #"Filled Down" = Table.FillDown(#"Replaced Value",{"Organization", "Active"}),

//Filter to show only Active="Yes and Employee not null
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Employee] <> null) and ([Active] = "Yes")),

//Pivot with no aggregation
//could do this with grouping, but easier (and maybe faster, with a custom function
    pivotAll = fnPivotAll(#"Filtered Rows","Organization","Employee"),

//remove unneeded Active column and set data types
    #"Removed Columns" = Table.RemoveColumns(pivotAll,{"Active"}),
    typed = Table.TransformColumnTypes(#"Removed Columns",
        List.Transform(Table.ColumnNames(#"Removed Columns"),each {_, Text.Type}))

in
    typed

typed Results
enter image description here

  • Related