I am trying to upload couple panel dataset to MySQL server. But I have couple issues when it came to schema design.
How to create a relational schema and combine these 2 tables together in MySQL database?
Do I need Star schema or Snowflake schema?
Dataset (1): account profile table, it has multiple client's info in daily granularity (365 days in total). Example shown below, 2 different accounts (HP and KA) with client income, age and gender. File date is the client account status date.
Account ID | Income | Age | Gender | File Date |
---|---|---|---|---|
HP | 10,000 | 40 | Male | 2019-04-01 |
HP | 10,000 | 40 | Male | 2019-04-02 |
HP | 10,000 | 40 | Male | 2019-04-03 |
HP | 12,000 | 40 | Male | 2019-04-04 |
KA | 12,000 | 23 | Female | 2019-04-01 |
KA | 12,000 | 23 | Female | 2019-04-02 |
KA | 12,000 | 23 | Female | 2019-04-03 |
KA | 12,000 | 23 | Female | 2019-04-04 |
Dataset (2): account trading table, it has multiple client's info in daily granularity. Example shown below, first row says account HP bought google stock on 2019-06-12 for 500.00 Dollars.
Account ID | Stock ID | Trade Type | Trade Amount | File Date |
---|---|---|---|---|
HP | GOOG | Buy | 500.0 | 2019-06-12 |
HP | APPL | Sell | 600.0 | 2020-03-23 |
KA | AMZN | Sell | 1000.0 | 2020-07-23 |
KA | APPL | Sell | 353.0 | 2020-10-13 |
KA | MSFT | Buy | 400.0 | 2021-02-03 |
CodePudding user response:
I didn't get an explanation, so I'm assuming that income is a synonym for balance and not additional money each day.
I'd start with an Account table
Account
-------
Account ID (PK)
Gender
Birthdate
And an Account Balance table
Account Balance
---------------
Account ID
Balance Date
Balance
Where the primary key is (Account ID, Balance Date descending).
Finally, we have a Transaction table
Transaction
-----------
Transaction ID (PK)
Account ID
Transaction Date
Stock ID
Trade Type
Trade Amount
With a unique index on (Account ID, Transaction Date descending)