Home > Back-end >  Merging 2 datasets using sql?
Merging 2 datasets using sql?

Time:10-11

Im doing a data management on a dataset to train my self. I have two tables the first one is called "Suivi" and the second one "Treatement", i wanna merge the next 2 tables in way to get the third one.

Suivi Table:

Suivi_id member_id rhinitis_symptoms medicin_id1 medicin_id2 medicin_id3 medicin_id4 medicin_id5
3357195 85763 51
3357665 87339 8
3358237 87343 17
3358319 87345 26
3358893 87343 0
3359371 87343 0
3360505 87343 30
3362165 87345 4
3362607 87343 0
3362613 87505 64

Treatment :

Treatement_id suivi_data_id medicin_id
927231 3357195 7501
927765 3357665 7489
928423 3358237 7489
928425 3358237 7501
928427 3358237 7623
928539 3358319 7489
929173 3358893 7489
929741 3359371 7489
931089 3360505 7489
932937 3362165 7489
933381 3362607 7569
933383 3362607 7645
933385 3362607 7789
933387 3362607 19159
933393 3362613 7569
933395 3362613 7605
933397 3362613 7645
933399 3362613 7789
933401 3362613 19159

The wanted table :

Suivi_id member_id rhinitis_symptoms medicin_id1 medicin_id2 medicin_id3 medicin_id4 medicin_id5
3357195 85763 51 7501
3357665 87339 8 7489
3358237 87343 17 7489 7501 7623
3358319 87345 26 7489
3358893 87343 0 7489
3359371 87343 0 7489
3360505 87343 30 7489
3362165 87345 4 7489
3362607 87343 0 7569 7645 7789 19159
3362613 87505 64 7569 7605 7645 7789 19159

Is there a way to do it in sql (in access ?!) or by doing a function in R or python.

Thanks in advance

CodePudding user response:

What you're looking for is a join.

Something like:

SELECT suivi.Suivi_id, suivi.member_id, treatment.medicin_id ...
FROM suivi
JOIN treatment ON suivi.Suivi_id = treatment.suivi_data_id

CodePudding user response:

I know two ways. Either with the Pivot Function(https://learn.microsoft.com/de-de/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-ver16) or rank your data with the ROW_NUMMBER() function in mssql. (Other dbs should have similar functions.) It is also possible to write something dynamic, but here the staitc version:

WITH TreatementRowNumber as (
SELECT 
ROW_NUMBER() OVER (PARTITION BY [suivi_data_id]  ORDER BY [suivi_data_id])     RowNumber
, suivi_data_id
, medicin_id
FROM Treatement

)

Select
[Suivi_id]
, [member_id]
, [rhinitis_symptoms]
, (SELECT Medicin_id FROM TreatementRowNumber WHERE Suivi.Suivi_id = TreatementRowNumber.suivi_data_id AND RowNumber = 1) medicin_id1
, (SELECT Medicin_id FROM TreatementRowNumber WHERE Suivi.Suivi_id = TreatementRowNumber.suivi_data_id AND RowNumber = 2) medicin_id2
, (SELECT Medicin_id FROM TreatementRowNumber WHERE Suivi.Suivi_id = TreatementRowNumber.suivi_data_id AND RowNumber = 3) medicin_id3
, (SELECT Medicin_id FROM TreatementRowNumber WHERE Suivi.Suivi_id = TreatementRowNumber.suivi_data_id AND RowNumber = 4) medicin_id4
, (SELECT Medicin_id FROM TreatementRowNumber WHERE Suivi.Suivi_id = TreatementRowNumber.suivi_data_id AND RowNumber = 5) medicin_id5
 
FROM Suivi
  • Related