Home > other >  Conditional Columns and Delimited Fields
Conditional Columns and Delimited Fields

Time:03-21

I receive a daily export of data every day I load into my excel sheet via Power Query. The table of data I can't control is:

tblExport

Name Company States
Jane Doe ABC AK,AL,GA,WA
John Smith ACME AK,GA,FL,WA

I need to replace those State Abbreviations with a technology string of information for this question I'll use "Full State Name" as a substitute. So basically it checks the COMPANY field against another table as the "technology Strings" will be different for each Company per State.

So far so good, or so I thought. Then I split delimiters of tblExport.States BY "," which then I get

Name Company States.1 States.2 States.3 States.4
Jane Doe ABC AK AL GA WA
John Smith ACME AK GA FL WA

Now we reference that table that contains the Company, State, FullStateNames

tblStateNames

COMPANY Abbr State Name
ABC AL AlabamaABC
ABC AK AlaskaABC
ACME AK AlaskaACME
ACME GA GeorgiaACME
ABC FL FloridaABC
ABC WA WashingtonABC
ACME WA WashingtonACME
        ST01 = Table.NestedJoin(#"Changed Type1", {"States.1", "Company"}, 
        tblStateNames, {"Abbr", "Company"}, "tblStateNames", 
        JoinKind.LeftOuter),
        ExpST01 = Table.ExpandTableColumn(ST01, "tblStateNames", {"State 
        Name"}, {"tblStateNames.State Name"}),

Which works great until I meet a condition such as Company ABC has GA in the TblExport.States, but they do not qualify for GA. So when it joins the query tblStateNames and ABC doesn't match for GA it returns a null value.

So my column output is

Name Company ST01 ST02 ST03 ST04
Jane Doe ABC AlaskaABC AlabamaABC null WashingtonABC
John Smith ACME AlaskaACME GeorgiaACME FloridaACME WashingtonACME

A couple things about this. The original TblExport is a daily intake and the people range for their states, some will have ZERO and the rest can be anywhere from 1 to 40 states. The challenge and why this is semi the issue is because I can't have any gaps in the columns. So while ST03 displays null as it should I rather have it fill ST04 into the ST03 column.

Name Company ST01 ST02 ST03 ST04
Jane Doe ABC AlaskaABC null null WashingtonABC
John Smith ACME AlaskaACME GeorgiaACME FloridaACME WashingtonACME

Now after the fact I could do a conditional IF ST02 is not equal <> null then ST02 else ST03. However in this example that null simply moves from ST03 to ST02. However, this only moves the next one down a column, so a double null will still lead to an issue.

In my very new to PQ head, I think I somehow need to validate the states in the original delimited field before doing the query lookup?

I know I've probably overly complicated things and masking the actual code for internal only reasons, takes a bit longer for me to try to explain. :)

I appreciate any input, when responding, try to keep in mind my experience level. Experience Level: I'm a Highly Functioning Idiot.

Patrick

CodePudding user response:

If I understand, here is one way to do it:

  • Read in the two tables
  • split the Export table state abbreviations into ROWS
  • Join with the StateName Table
  • Group by Name and Company
  • Extract a delimited list of the state names from each subtable
  • Expand that list

Please read the code comments and explore the Applied Steps to better understand what is going on

let

//Read in the two tables
    Source = Excel.CurrentWorkbook(){[Name="tblStateNames"]}[Content],
    tblStateNames = Table.TransformColumnTypes(Source, List.Transform(Table.ColumnNames(Source), each {_, type text})),

    Source2 = Excel.CurrentWorkbook(){[Name="tblExport"]}[Content],
    tblExport = Table.TransformColumnTypes(Source2, List.Transform(Table.ColumnNames(Source2), each {_, type text})),

//split the States column by comma into Rows
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(tblExport, {
        {"States", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), 
            let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "States"),

//join with the States Names
    joined = Table.Join(#"Split Column by Delimiter",{"States","Company"},tblStateNames,{"Abbr","COMPANY"}, JoinKind.LeftOuter),
    #"Removed Columns" = Table.RemoveColumns(joined,{"States", "COMPANY", "Abbr"}),
    #"Grouped Rows" = Table.Group(#"Removed Columns", {"Name","Company"}, {
        {"States", each Text.Combine([State Name],";"),type text}}),

//split with no column Count option
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Grouped Rows", "States", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv))
in
    #"Split Column by Delimiter1"

enter image description here

  • Related