Home > Back-end >  MySQL query for matching multiple rows with >= and <= conditions
MySQL query for matching multiple rows with >= and <= conditions

Time:04-29

Let me explain a little bit more in detail...

I have a mysql table like this:

| session_id | session_start       | session_end         |
|------------|---------------------|---------------------|
| 1          | 2016-01-01 01:00:00 | 2016-01-01 01:03:00 |
| 2          | 2016-01-01 03:00:00 | 2016-01-01 03:02:24 |
| 3          | 2016-01-01 03:35:00 | 2016-01-01 03:37:02 |
| ...        | ...                 | ...                 |

Where the time interval between session_start and session_end never overlaps for different rows. Each session refers to a chat conversation where messages are interchanged between two users.

I also have a mongoDB where, for each document, I store text messages with a date/timestamp of when it was received, call it event_date. But, the document doesn't tell which session the message belongs to. So, the only way we can match a mongoDB document to a session in the mysql table is if event_date is between session_start and session_end for some record in mysql and then retreive the session_id.

My current solution is running a query like this (I'm working in python with mysql library for this), in pseudo-code:

for event_date in all_event_dates:
    cursor.execute("SELECT session_id
         FROM session_table
         WHERE %s >= session_start
           AND %s <= session_end",
        (event_date, event_date))
    row = cursor.fetchone()

But this is a very slow process since the length of the list all_event_dates is in the order of 100k. Is there a more efficient query where I can run it for multiple event_date values at a time?

Thanks a lot in advance

CodePudding user response:

Load all the event dates into a temporary table, then join with that.

cursor.execute("CREATE TEMPORARY TABLE event_dates (event_date DATETIME)";
cursor.executemany("INSERT INTO event_dates VALUES (?)", [(date,) for date in all_event_dates])
connection.commit()
cursor.execute("""
    SELECT session_id 
    FROM session_table AS s
    JOIN event_dates AS e ON e.event_date BETWEEN s.session_start AND s.session_end""")
rows = cursor.fetchall()

CodePudding user response:

Because they don't overlap, do

SELECT session_id
     FROM session_table
     WHERE %s >= session_start
     ORDER BY session_start
     LIMIT 1

and have this on the table:

INDEX(session_start, session_id)
  • Related