Home > database >  Bigint vs varchar datatype. Should I match source system?
Bigint vs varchar datatype. Should I match source system?

Time:03-23

Unfortunately, I think this is an opinion question. But I'd like to know if there is a best practice.

Our datawarehouse uses a bigint as a primary key for an address table. The existing records all came from a source system that used a bigint, so it matched. We have a new source system that now uses a varchar value for the identity. It's still just a number, and doesn't have any collisions with the old system. And doesn't use leading zeros or anything. The data comes in as a json message and we parse out the information. so there isn't any joining between the systems. So should we change our DW key to be varchar to match, or say screw it and just convert to bigint on the import.

Also, the update won't be that risky or affect many other processes.

One other piece of information, old source system was websphere. New source is C4/SAP.

CodePudding user response:

Your data warehouse shouldn't be using the Business Keys as the primary identifier regardless of the data type. Data Warehouses should use Surrogate Keys that are unique to the Data Warehouse, so that you can handle situations where you need to have multiple instances of the same Dimensional data, want to keep a history of the Dimensional data, the source system re-uses a now deleted Business Key or any number of other instances where the original Business Key causes problems in your DWH.

Personally, I would implement Surrogate Keys for your existing DWH tables and then you will have no issues incorporating the new data alongside the old.

  • Related