Home > Enterprise >  Inner join large database with duplicate values
Inner join large database with duplicate values

Time:09-12

I got very large database where I need to find records with timestamp column value older than 30 days but exclude rows where timestamp value within last 30 days. and join it with another table to display. Also please take note that DB storage is based on slow HDD.

User table

TIMESTAMP   USER
2022-09-10  BOB
2022-08-20  BOB
2022-08-15  SAL
2022-05-07  MIKE
2022-04-07  RON
2022-03-07  BOB
2022-02-07  SAL
2022-01-15  JAKE
    

Tag Table

USER    TAG
BOB     active
SAL     pending
MIKE    inactive
RON     active
JAKE    pending

I want display values from user table where timestamp value is older than 30 days but exclude the row if it's found within 30 days and join it with ta table TAG

output should be like this

USER    TAG
MIKE    inactive
RON     active
JAKE    pending 

what I've tried so far.

get the all the rows from USER table where TIMESTAMP within 30 days and save to user_active table

get all the rows from USER table where TIMESTAMP not within last 30 days and USER not in user_active table and save to user_inactive table

and try to join it as follows

SELECT U.USERNAME, U.TIMESTAMP, T.TAG
FROM 
  user_inactive  AS U 
RIGHT JOIN 
  TAG_TABLE AS T
ON
U.USERNAME=T.USERNAME
LIMIT 100;

CodePudding user response:

You could use :

select u.user,t.tag
from  user_tbl u 
inner join tag_tbl t on u.user=t.user
where  u.user not in ( select user
                         from user_tbl 
                         where `timestamp` >=(now() -INTERVAL 3 MONTH)
                       )  ;

https://dbfiddle.uk/xAYECjwU

The where u.user not in will filter all the users that have the timestamp between current time and 3 months interval. Alternatively, you could use exists.

  • Related