Home > Blockchain >  Transforming Data using Power Query to grouped the same value
Transforming Data using Power Query to grouped the same value

Time:03-26

Can someone tell me how I can transform my data from Table 1 to Table 2 format using Power Query in Excel?

enter image description here

CodePudding user response:

  • Group by Customer
    • Aggregate by creating a list of Records for each Customer
  • Expand the record list

Code edited to allow for varying numbers of vaccination dates

let

//change table name to your actual table name
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer Name", type text}, {"Vaccination Date", type date}}),

//group by Customer, then create expandable record for the vaccination dates
    group = Table.Group(#"Changed Type","Customer Name",{
        {"Vaccination Date", (t)=> Record.FromList(t[Vaccination Date], 
            List.Transform({1..Table.RowCount(t)}, each "Vaccination Date " & Text.From(_)))}
    }),

//Get Record Field Names, to allow for changing number of vaccination dates
// will need to make changes if there are more than nine (9) dates
    rfn = List.Sort(List.Distinct(List.Accumulate(group[Vaccination Date],{}, (state, current)=> 
        state & Record.FieldNames(current)))),

//Expand the recordd
    #"Expanded Vaccination Date" = Table.ExpandRecordColumn(group, "Vaccination Date", 
        rfn),

//Set the data types to date
    typeDates = Table.TransformColumnTypes(#"Expanded Vaccination Date",
        List.Transform(rfn, each {_, type date}))
    
in
    typeDates

enter image description here

CodePudding user response:

I will try with this sample data: enter image description here

Here is Power Query script:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wci4tLsnPTS1ScFTSUTLUN9Q3MjAwVIrVQZMxgsgYo8o4IWQMMWWM4XpiAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Customer Name" = _t, #"Vaccination Date" = _t]),
    #"Grouped Rows" = Table.Group(Source, {"Customer Name"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"p", each Table.AddIndexColumn(_, "CountMultiple",1,1), type table}}),
    #"Expanded p" = Table.ExpandTableColumn(#"Grouped Rows", "p", {"Vaccination Date", "CountMultiple"}, {"p.Vaccination Date", "p.CountMultiple"}),
    #"Added Prefix" = Table.TransformColumns(#"Expanded p", {{"p.CountMultiple", each "Vaccination Date " & Text.From(_, "en-GB"), type text}}),
    #"Pivoted Column" = Table.Pivot(#"Added Prefix", List.Distinct(#"Added Prefix"[p.CountMultiple]), "p.CountMultiple", "p.Vaccination Date"),
    #"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Count"})
in
    #"Removed Columns"

and the key thing here is this custom 'group by' add index (set serial) inside a group in 'group by' (here is 'Customer Name'):

{"p", each Table.AddIndexColumn(_, "CountMultiple",1,1), type table}

This way can really help in case your 'Customer Name' has multiple 'Vaccination Date'(s) other than 2...

enter image description here

(reference: how-to-find-duplicate-numbers-in-power-query-and-assign-them)

  • Related