I have 2 columns in the table and I want to search for duplicates for every row from first column in all column 2. So I want to find if every row in column 1 exist in all column 2. I specify that my table is big, have 3 millions rows. my table name is bigquerry and have 2 columns : id1, and id2
CodePudding user response:
Use a JOIN
to link the table back on itself, and JOIN
on the ID fields.
Assuming a simple table like this:
id | id1 | id2 |
---|---|---|
1 | A | B |
2 | C | D |
3 | E | F |
4 | G | A |
5 | H | C |
6 | I | A |
7 | J | A |
8 | K | C |
9 | L | F |
You can extract the duplicates with this query:
select a.id, a.id1, b.id, b.id2 from duplicates a join (select id, id2 from duplicates) b on a.id1 = b.id2;
Output:
id | id1 | id | id2 |
---|---|---|---|
1 | A | 4 | A |
1 | A | 6 | A |
1 | A | 7 | A |
2 | C | 5 | C |
2 | C | 8 | C |
Note Make sure you have indexed the id2
column or this could take a very long time.
Demo: https://www.db-fiddle.com/f/fbZiyxa7fSzZFDhm9BS32L/1
CodePudding user response:
Here you go
<?php
//set time limit to indefinite to parse the 3m rows
set_time_limit(0);
$conn = new mysqli("localhost", "username", "password", "db_name");
$id1list = $conn->query("SELECT id1 FROM `bigquery`");
//loop through all id1 returned by the query
while($r = mysqli_fetch_assoc($id1list)) {
$id1 = $r['id1'];
//get the count of id2 which are equal to the current id1
$id2count = $conn->query("SELECT count(id2) FROM `bigquery` WHERE id2='$id1'");
$count = mysqli_fetch_array($id2count)[0];
//print if count is greater than 0 (exists in id2)
if($count>0){
echo "$id1 is repeated $count times. <br>";
}
}
?>