To start off, I am creating a table schema in Power BI with the use of R to wrangle all of my data.
Here is a simple example of my issue. I have a table with ID numbers and several demographics related to each ID. In this case, you'll see 3 demographics related to each ID with a true/false for each demographic (in my work situation, I actually have 95 demographics).
ID | Female | Veteran | Government |
---|---|---|---|
1 | TRUE | FALSE | FALSE |
2 | FALSE | FALSE | TRUE |
3 | TRUE | TRUE | TRUE |
4 | FALSE | FALSE | FALSE |
So, let this be the fact table. A dimension table would look something like this:
Demographic | Key |
---|---|
Female | 10 |
Veteran | 11 |
Government | 12 |
I need to create a relationship between these two tables. I'll be using Power BI, so I can only use one direct relationship. The main purpose is to be able to create visualizations that will filter down on a user's selection. For example, if the user is interested in how many ID's are female and veteran, the graph would only show ID #3.
As is, the dimension table will not work because there are no keys in the fact table to connect the two. For it to work properly, I would need one, and only one, column in the fact table with a key that connects to the dimension table. That would look like this:
ID | Female | Veteran | Government | Key |
---|---|---|---|---|
1 | TRUE | FALSE | FALSE | 10 |
2 | FALSE | FALSE | TRUE | 12 |
3 | TRUE | TRUE | TRUE | 10, 11, 12 |
4 | FALSE | FALSE | FALSE |
This doesn't work because Power BI won't "search" for a key within the "Key" column. It can only have one key per row, not a set of keys, as far as I'm aware. I could potentially make keys that would be the combination of demographics. So, for ID #3, the key would be "10_11_12" and then have that exact key within the dimension table, too. But, as mentioned above, I have 95 demographic columns and that's a right massive mess.
I have also tried to make the initial fact table above long instead of wide:
ID | Demographic | Value | Key |
---|---|---|---|
1 | Female | True | 10 |
1 | Veteran | False | 11 |
1 | Government | False | 12 |
2 | Female | False | 10 |
2 | Veteran | False | 11 |
2 | Government | True | 12 |
3 | Female | True | 10 |
3 | Veteran | True | 11 |
3 | Government | True | 12 |
4 | Female | False | 10 |
4 | Veteran | False | 11 |
4 | Government | False | 12 |
However, Power BI will only aggregate the data. That is, for our example of female and veteran, the graph will show any ID that is female as well as any ID that is veteran. So, the result would show ID's #1 and #3, but it should only show #3 (I need female and veteran not female or veteran).
Any ideas of how to get a dimension table and fact table to work well together for my situation?
CodePudding user response:
If this is your Fact Table
(ID,Female, Veteran, Government)
Your dimensions would be
DimFemale, DimVeteran, DimGovernment, etc. Each Dimension would have two rows and probably only a single column. So you typically just don't use dimension tables when the you don't have any data other than the dimension key.
CodePudding user response:
You only need the first table, no need to complicate.
If the user wants to know the number of ID's that are female and veteran, you would have a slicer/filter for female and veteran and the results would be filtered accordingly.