I receive a daily export of data every day I load into my excel sheet via Power Query. The table of data I can't control is:
tblExport
Name | Company | States |
---|---|---|
Jane Doe | ABC | AK,AL,GA,WA |
John Smith | ACME | AK,GA,FL,WA |
I need to replace those State Abbreviations with a technology string of information for this question I'll use "Full State Name" as a substitute. So basically it checks the COMPANY field against another table as the "technology Strings" will be different for each Company per State.
So far so good, or so I thought. Then I split delimiters of tblExport.States BY "," which then I get
Name | Company | States.1 | States.2 | States.3 | States.4 |
---|---|---|---|---|---|
Jane Doe | ABC | AK | AL | GA | WA |
John Smith | ACME | AK | GA | FL | WA |
Now we reference that table that contains the Company, State, FullStateNames
tblStateNames
COMPANY | Abbr | State Name |
---|---|---|
ABC | AL | AlabamaABC |
ABC | AK | AlaskaABC |
ACME | AK | AlaskaACME |
ACME | GA | GeorgiaACME |
ABC | FL | FloridaABC |
ABC | WA | WashingtonABC |
ACME | WA | WashingtonACME |
ST01 = Table.NestedJoin(#"Changed Type1", {"States.1", "Company"},
tblStateNames, {"Abbr", "Company"}, "tblStateNames",
JoinKind.LeftOuter),
ExpST01 = Table.ExpandTableColumn(ST01, "tblStateNames", {"State
Name"}, {"tblStateNames.State Name"}),
Which works great until I meet a condition such as Company ABC has GA in the TblExport.States, but they do not qualify for GA. So when it joins the query tblStateNames and ABC doesn't match for GA it returns a null value.
So my column output is
Name | Company | ST01 | ST02 | ST03 | ST04 |
---|---|---|---|---|---|
Jane Doe | ABC | AlaskaABC | AlabamaABC | null | WashingtonABC |
John Smith | ACME | AlaskaACME | GeorgiaACME | FloridaACME | WashingtonACME |
A couple things about this. The original TblExport is a daily intake and the people range for their states, some will have ZERO and the rest can be anywhere from 1 to 40 states. The challenge and why this is semi the issue is because I can't have any gaps in the columns. So while ST03 displays null as it should I rather have it fill ST04 into the ST03 column.
Name | Company | ST01 | ST02 | ST03 | ST04 |
---|---|---|---|---|---|
Jane Doe | ABC | AlaskaABC | null | null | WashingtonABC |
John Smith | ACME | AlaskaACME | GeorgiaACME | FloridaACME | WashingtonACME |
Now after the fact I could do a conditional IF ST02 is not equal <> null then ST02 else ST03. However in this example that null simply moves from ST03 to ST02. However, this only moves the next one down a column, so a double null will still lead to an issue.
In my very new to PQ head, I think I somehow need to validate the states in the original delimited field before doing the query lookup?
I know I've probably overly complicated things and masking the actual code for internal only reasons, takes a bit longer for me to try to explain. :)
I appreciate any input, when responding, try to keep in mind my experience level. Experience Level: I'm a Highly Functioning Idiot.
Patrick
CodePudding user response:
If I understand, here is one way to do it:
- Read in the two tables
- split the Export table state abbreviations into ROWS
- Join with the StateName Table
- Group by Name and Company
- Extract a delimited list of the state names from each subtable
- Expand that list
Please read the code comments and explore the Applied Steps to better understand what is going on
let
//Read in the two tables
Source = Excel.CurrentWorkbook(){[Name="tblStateNames"]}[Content],
tblStateNames = Table.TransformColumnTypes(Source, List.Transform(Table.ColumnNames(Source), each {_, type text})),
Source2 = Excel.CurrentWorkbook(){[Name="tblExport"]}[Content],
tblExport = Table.TransformColumnTypes(Source2, List.Transform(Table.ColumnNames(Source2), each {_, type text})),
//split the States column by comma into Rows
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(tblExport, {
{"States", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),
let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "States"),
//join with the States Names
joined = Table.Join(#"Split Column by Delimiter",{"States","Company"},tblStateNames,{"Abbr","COMPANY"}, JoinKind.LeftOuter),
#"Removed Columns" = Table.RemoveColumns(joined,{"States", "COMPANY", "Abbr"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Name","Company"}, {
{"States", each Text.Combine([State Name],";"),type text}}),
//split with no column Count option
#"Split Column by Delimiter1" = Table.SplitColumn(#"Grouped Rows", "States", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv))
in
#"Split Column by Delimiter1"