I'm working on a visualization project at work - it will display which of our partners sells the highest amount of merchandise per month from Jan to Dec last year. Currently, our company's business intelligence tool outputs the following
Ship State Month Partner Net Demand
AK Jan Coca-Cola 3200
AK Jan Wegman's 2800
AK Jan Target 2000
...
AL Jan Coca-Cola 2600
AL Jan Walmart 2100
AL Jan Target 700
...
WY Jan Wegmans 1600
WY Jan Cola 200
WY Jan Target 98
...
AK Feb Coca-Cola 3100
AK Feb Wegman's 2700
AK Feb Target 1600
...
And so on. This is the current output - below is my desired output...
Ship State Month Partner Net Demand
AK Jan Coca-Cola 3200
AL Jan Coca-Cola 2600
...
WY Jan Wegmans 1600
...
AK Feb Coca-Cola 3100
...
If this makes sense. I only want the first ranked partner by top net demand. I don't want all of the others outputting. My company's BI tool is very limited. I've tried a pivot table in Excel, which gets me the ranks in a pivot table format, but when I copy/paste, the values all output wrongly. I filter by month and then state, so it's a vertical breakdown, I can't get all the values next to each other to feed into R. When I copy/paste, it looks like...
Jan
AK
Coca Cola 3200
AL
Coca Cola 2600
... etc
Is there any way to efficiently output and organize this? I need to do each partner for each state and intensive cleanup would take a lot of time because we have a lot of partners I'd have to delete and what not. I just want to output max rank per state per month - which is all the US states over a 12 month period.
CodePudding user response:
This can be accomplished using Power Query, available in Windows Excel 2010 and Excel 365 (Windows or Mac)
To use Power Query
- Select some cell in your Data Table
Data => Get&Transform => from Table/Range
orfrom within sheet
- When the PQ Editor opens:
Home => Advanced Editor
- Make note of the Table Name in Line 2
- Paste the M Code below in place of what you see
- Change the Table name in line 2 back to what was generated originally.
- Read the comments and explore the
Applied Steps
to understand the algorithm
M Code
let
//Change next line to return your actual table
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//set data types
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Ship State", type text}, {"Month", type text}, {"Partner", type text}, {"Net Demand", Int64.Type}}),
//Remove rows containing "..."
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Ship State] <> "...")),
//group by State and Month
//then filter each subtable for the maximum Net Demand
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Ship State", "Month"}, {
{"All", (t)=> Table.SelectRows(t, each [Net Demand] = List.Max(t[Net Demand])),
type table [Ship State=nullable text, Month=nullable text, Partner=nullable text, Net Demand=nullable number]}}),
//Re-exand the relevant table columns
#"Expanded All" = Table.ExpandTableColumn(#"Grouped Rows", "All", {"Partner", "Net Demand"})
in
#"Expanded All"