I am confused by the output I get after performing a self join. For example, this is my table:
select * from Logins
id login_date
7 2020-05-30
1 2020-05-30
7 2020-05-31
7 2020-05-01
7 2020-05-02
7 2020-05-02
7 2020-05-03
1 2020-05-07
7 2020-05-10
and the output after running the query
select * from Logins a join Logins b on a.id = b.id
is:
id login_date id login_date
7 2020-05-30 7 2020-05-30
7 2020-05-31 7 2020-05-30
7 2020-05-01 7 2020-05-30
7 2020-05-02 7 2020-05-30
7 2020-05-02 7 2020-05-30
7 2020-05-03 7 2020-05-30
7 2020-05-10 7 2020-05-30
1 2020-05-30 1 2020-05-30
1 2020-05-07 1 2020-05-30
7 2020-05-30 7 2020-05-31
7 2020-05-31 7 2020-05-31
7 2020-05-01 7 2020-05-31
7 2020-05-02 7 2020-05-31
7 2020-05-02 7 2020-05-31
7 2020-05-03 7 2020-05-31
7 2020-05-10 7 2020-05-31
7 2020-05-30 7 2020-05-01
7 2020-05-31 7 2020-05-01
7 2020-05-01 7 2020-05-01
7 2020-05-02 7 2020-05-01
7 2020-05-02 7 2020-05-01
7 2020-05-03 7 2020-05-01
7 2020-05-10 7 2020-05-01
7 2020-05-30 7 2020-05-02
7 2020-05-31 7 2020-05-02
7 2020-05-01 7 2020-05-02
7 2020-05-02 7 2020-05-02
7 2020-05-02 7 2020-05-02
7 2020-05-03 7 2020-05-02
7 2020-05-10 7 2020-05-02
7 2020-05-30 7 2020-05-02
7 2020-05-31 7 2020-05-02
7 2020-05-01 7 2020-05-02
7 2020-05-02 7 2020-05-02
7 2020-05-02 7 2020-05-02
7 2020-05-03 7 2020-05-02
7 2020-05-10 7 2020-05-02
7 2020-05-30 7 2020-05-03
7 2020-05-31 7 2020-05-03
7 2020-05-01 7 2020-05-03
7 2020-05-02 7 2020-05-03
7 2020-05-02 7 2020-05-03
7 2020-05-03 7 2020-05-03
7 2020-05-10 7 2020-05-03
1 2020-05-30 1 2020-05-07
1 2020-05-07 1 2020-05-07
7 2020-05-30 7 2020-05-10
7 2020-05-31 7 2020-05-10
7 2020-05-01 7 2020-05-10
7 2020-05-02 7 2020-05-10
7 2020-05-02 7 2020-05-10
7 2020-05-03 7 2020-05-10
7 2020-05-10 7 2020-05-10
53 rows.
Why is the self join pairing every date value in a with each date value from table b? Shouldn't it simply be something like
id login_date id login_date
7 5/30/2020 7 5/30/2020
1 5/30/2020 1 5/30/2020
7 5/31/2020 7 5/31/2020
7 5/1/2020 7 5/1/2020
7 5/2/2020 7 5/2/2020
7 5/2/2020 7 5/2/2020
7 5/3/2020 7 5/3/2020
1 5/7/2020 1 5/7/2020
7 5/10/2020 7 5/10/2020
where table b is a replica of table a, or just another table like it. I imagined a self join as nothing more than creating a replica of the table and joining it with the original.
I am just getting to know SQL, and this basic join has me confused, or maybe I am missing something very silly here. Please help.
CodePudding user response:
The IDs in the table aren't unique, so every row with a given ID is matched to ALL the rows with that ID in the joined table. E.g., you have two rows with ID 1 in the table. Each of them is matched to each of them in the joined table, for a total of 2*2=4 rows.
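To see the 2*2 pairing for ID 1 concretely, here is a minimal sketch. It assumes an in-memory SQLite database driven from Python, which is my choice for a runnable demo and not something the question specifies; the data is copied from the question.

```python
import sqlite3

# Sample rows copied from the question. Using in-memory SQLite is an
# assumption made only so the example can be run anywhere.
rows = [
    (7, "2020-05-30"), (1, "2020-05-30"), (7, "2020-05-31"),
    (7, "2020-05-01"), (7, "2020-05-02"), (7, "2020-05-02"),
    (7, "2020-05-03"), (1, "2020-05-07"), (7, "2020-05-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logins (id INTEGER, login_date TEXT)")
conn.executemany("INSERT INTO Logins VALUES (?, ?)", rows)

# Same self join as in the question, restricted to id = 1 so that the
# pairing is easy to read: each of the 2 rows in a meets each of the 2 in b.
pairs_for_id_1 = conn.execute(
    """
    SELECT a.login_date, b.login_date
    FROM Logins a JOIN Logins b ON a.id = b.id
    WHERE a.id = 1
    ORDER BY a.login_date, b.login_date
    """
).fetchall()

for a_date, b_date in pairs_for_id_1:
    print(a_date, b_date)
```

The four printed pairs are every combination of the two dates that ID 1 has, which is what the join condition `a.id = b.id` asks for.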
CodePudding user response:
This is how a join works: for each row from the first table, all rows from the second table with the matching join key are returned. Since the id join key is not unique, the join duplicates records. You have 7 rows with id = 7, which gives 7*7=49 rows with id=7, and the 2 rows with id=1 result in 2*2=4 rows with id=1, for a total of 53 rows.
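The arithmetic above can be checked with a short sketch. It again assumes in-memory SQLite from Python (my assumption, not part of the question): it counts the rows per id, sums the squares of those counts, and compares that with the actual size of the self join.

```python
import sqlite3

# Sample rows copied from the question; in-memory SQLite is an assumption.
rows = [
    (7, "2020-05-30"), (1, "2020-05-30"), (7, "2020-05-31"),
    (7, "2020-05-01"), (7, "2020-05-02"), (7, "2020-05-02"),
    (7, "2020-05-03"), (1, "2020-05-07"), (7, "2020-05-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logins (id INTEGER, login_date TEXT)")
conn.executemany("INSERT INTO Logins VALUES (?, ?)", rows)

# How many rows does each id have in the base table?
group_sizes = conn.execute(
    "SELECT id, COUNT(*) FROM Logins GROUP BY id ORDER BY id"
).fetchall()

# A group of n rows contributes n * n rows to the self join on id.
expected_rows = sum(n * n for _, n in group_sizes)

# Size of the self join from the question.
actual_rows = conn.execute(
    "SELECT COUNT(*) FROM Logins a JOIN Logins b ON a.id = b.id"
).fetchone()[0]

print(group_sizes, expected_rows, actual_rows)
```

Both counts come out the same, so the 53 rows are simply the sum of the squared group sizes.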
CodePudding user response:
To get this result:
id login_date id login_date
7 5/30/2020 7 5/30/2020
1 5/30/2020 1 5/30/2020
7 5/31/2020 7 5/31/2020
7 5/1/2020 7 5/1/2020
7 5/2/2020 7 5/2/2020
7 5/2/2020 7 5/2/2020
7 5/3/2020 7 5/3/2020
1 5/7/2020 1 5/7/2020
7 5/10/2020 7 5/10/2020
Try this query:
select * from Logins a
join Logins b
on a.id = b.id AND a.login_date = b.login_date
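This query compares both columns, so each row is matched only with rows that have the same id and the same login_date. Note that (7, 2020-05-02) appears twice in the table, so that pair is still matched 2*2=4 times and the query returns 11 rows rather than the 9 shown above; to get exactly one match per row, the table would need a unique key to join on.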
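A rough way to check the row counts on the sample data is sketched below, assuming in-memory SQLite from Python. The second query joins on `rowid`, which is a SQLite-specific implicit row identifier and stands in here for a real unique key; other databases would need an explicit key column instead.

```python
import sqlite3

# Sample rows copied from the question; in-memory SQLite is an assumption.
rows = [
    (7, "2020-05-30"), (1, "2020-05-30"), (7, "2020-05-31"),
    (7, "2020-05-01"), (7, "2020-05-02"), (7, "2020-05-02"),
    (7, "2020-05-03"), (1, "2020-05-07"), (7, "2020-05-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logins (id INTEGER, login_date TEXT)")
conn.executemany("INSERT INTO Logins VALUES (?, ?)", rows)

# Join on both columns: the duplicated (7, 2020-05-02) row still pairs 2 * 2.
rows_both_columns = conn.execute(
    """
    SELECT COUNT(*)
    FROM Logins a JOIN Logins b
      ON a.id = b.id AND a.login_date = b.login_date
    """
).fetchone()[0]

# Join on SQLite's implicit rowid (SQLite-specific, used as a stand-in for
# a unique key): every row matches only itself.
rows_by_rowid = conn.execute(
    "SELECT COUNT(*) FROM Logins a JOIN Logins b ON a.rowid = b.rowid"
).fetchone()[0]

print(rows_both_columns, rows_by_rowid)
```

Joining on a unique key is what produces the one-to-one result the question expected; any join key that repeats will multiply rows.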