I have these two tables that I need to join side by side
Table A
id | date |
---|---|
1 | 03/01/2021 |
1 | 04/01/2021 |
1 | 05/01/2021 |
2 | 04/01/2021 |
2 | 05/01/2021 |
3 | 03/01/2021 |
3 | 04/01/2021 |
Table B
id | date |
---|---|
1 | 03/01/2021 |
1 | 04/01/2021 |
1 | 05/01/2021 |
1 | 06/01/2021 |
2 | 04/02/2021 |
2 | 05/02/2021 |
3 | 03/01/2021 |
The output would be
id | dateA | dateB |
---|---|---|
1 | 03/01/2021 | 03/01/2021 |
1 | 04/01/2021 | 04/01/2021 |
1 | 05/01/2021 | 05/01/2021 |
1 | 06/01/2021 | |
2 | 04/01/2021 | 04/02/2021 |
2 | 05/01/2021 | 05/02/2021 |
3 | 03/01/2021 | 03/01/2021 |
3 | 04/01/2021 |
Basically, search all records that match a value, (for example 1, then list them side by side)
I tried joining them using id as key but it spawned a multitude of other rows that I dont want. Tried grouping as well but it messes with the order
I’m using sqlite via pandas
The quey below causes some extra rows which i cant figure out how to filter out
SELECT A.id, A.date, B.date
FROM A
JOIN B
ON B.id = A.id
Adding a group by causes the table to output only the first records of each multiple
CodePudding user response:
Indeed you need a FULL JOIN
which doesn't exist within SQLite, then consider using such a simulaton through use of only provided OUTER JOIN
, which is unlike to the current INNER JOIN
,type which is LEFT JOIN
as
SELECT B.id, A.date, B.date
FROM B
LEFT JOIN A
ON B.id = A.id
AND B.date = A.date
UNION
SELECT A.id, A.date, B.date
FROM A
LEFT JOIN B
ON B.id = A.id
AND B.date = A.date
CodePudding user response:
Use a CTE where you rank all the rows of both tables by id and order of the dates and then aggregate:
WITH cte AS (
SELECT id, date dateA, null dateB, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) rn
FROM TableA
UNION ALL
SELECT id, null, date, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) rn
FROM TableB
)
SELECT id, MAX(dateA) dateA, MAX(dateB) dateB
FROM cte
GROUP BY id, rn
ORDER BY id, rn;
See the demo.
Note that your dates as they are in the format dd/mm/yyyy
, they are not comparable.
You should change them to yyyy-mm-dd
for the code to work properly.