Home > Software design >  JOIN doubles the result SQL
JOIN doubles the result SQL

Time:07-12

I have two tables

Table 1

Row| click_datetime | Date | Time | job_id | user_ip

1 | 2022-04-06 19:49:56 UTC | 2022-04-06 | 19:49:56 | 8.2246310708361155e 18 | 200.2.222.24

2 | 2022-04-06 19:47:31 UTC | 2022-04-06 | 19:47:31 | 3.0507603609338957e 18 | 200.2.222.24

3 | 2022-04-16 18:51:04 UTC | 2022-04-16 | 18:51:04 | 1.8549485175961231e 18 | 200.2.222.268

4 | 2022-04-1618:51:04 UTC | 2022-04-16 | 18:51:04 | 1.8549485175961231e 18 | 200.2.222.268

Table 2

Row | job_id | user_ip | partner | browser

1 | 1.8549485175961231e 18 | 200.2.222.268 | 402 | Opera

2 | 1.8549485175961231e 18 | 200.2.222.268 | 402 | Opera

3 | 3.0507603609338957e 18 | 200.2.222.24 | 412 | Chrome

4 | 8.2246310708361155e 18 | 200.2.222.24 | 412 | Chrome

I need all partners from table 2 that are in table 1, but the result is doubled and duplicates come out. I have used all kinds of JOIN'S and DISTINCT. But at the output, the join doubles the result

SELECT t1._click_datetime_, t2.partner, t2.user_ip 
FROM t1
LEFT JOIN t2
ON t1.user_ip=t2.user_ip AND t1.job_id=t2.job_id

the output is doubled

Row | click_datetime | partner | user_ip

1 | 2022-04-06 19:49:56 UTC | 412 | 200.2.222.24

2 | 2022-04-06 19:47:31 UTC | 412 | 200.2.222.24

3 | 2022-04-16 18:51:04 UTC | 402 | 200.2.222.268

4 | 2022-04-16 18:51:04 UTC | 402 | 200.2.222.268

5 | 2022-04-16 18:51:04 UTC | 402 | 200.2.222.268

6 | 2022-04-16 18:51:04 UTC | 402 | 200.2.222.268

CodePudding user response:

You should probably add 'DISTINCT' or 'GROUP BY' clause to your query.

CodePudding user response:

The join doesnt double the results, you think there are double because you dont have all the columns in the select.

In order to figure out what the issue is, Just change your SELECT clause to *:

SELECT * 
FROM t1
LEFT JOIN t2
ON t1.user_ip=t2.user_ip AND t1.job_id=t2.job_id

And you will see that there are no duplicates because all rows different.

If you want to keep only distinct values for the columns you selected, use DISTINCT or a GROUP BY

SELECT DISTINCT t1._click_datetime_, t2.partner, t2.user_ip 
FROM t1
LEFT JOIN t2
ON t1.user_ip=t2.user_ip AND t1.job_id=t2.job_id

or

SELECT t1._click_datetime_, t2.partner, t2.user_ip 
FROM t1
LEFT JOIN t2
ON t1.user_ip=t2.user_ip AND t1.job_id=t2.job_id
GROUP BY t1._click_datetime_, t2.partner, t2.user_ip
  • Related