Multithreading for a socket connection in python

Time:11-24

I'm trying to scrape very busy Twitch chats for keywords, but sometimes the socket stalls for a split second, and in that split second five messages can go by. I tried adding multithreading, but had no luck with the code below: the threads seem to either all miss a keyword or all catch it. Any help is appreciated. Code below:

import os
import time
from dotenv import load_dotenv
import socket
import logging
from emoji import demojize

import threading

# loading environment variables
load_dotenv()

# variables for socket
server = "irc.chat.twitch.tv"
port = 6667
nickname = "frankied003"
token = os.getenv("TWITCH_TOKEN")
channel = "#xqcow"

# creating the socket and connecting
sock = socket.socket()
sock.connect((server, port))
sock.send(f"PASS {token}\n".encode("utf-8"))
sock.send(f"NICK {nickname}\n".encode("utf-8"))
sock.send(f"JOIN {channel}\n".encode("utf-8"))

while True:
    consoleInput = input(
        "Enter correct answer to the question (use a ',' for multiple answers):"
    )

    # if console input is stop, the code will stop ofcourse lol
    if consoleInput == "stop":
        break

    # make array of all the correct answers
    correctAnswers = consoleInput.split(",")
    correctAnswers = [answer.strip().lower() for answer in correctAnswers]

    def threadingFunction():

        correctAnswerFound = False

        # while the correct answer is not found, the chats will keep on printing
        while correctAnswerFound is not True:

            while True:
                try:
                    resp = sock.recv(2048).decode(
                        "utf-8"
                    )  # sometimes this fails, hence retry until it succeeds
                except:
                    continue
                break

            if resp.startswith("PING"):
                sock.send("PONG\n".encode("utf-8"))

            elif len(resp) > 0:
                username = resp.split(":")[1].split("!")[0]
                message = resp.split(":")[2]
                strippedMessage = " ".join(message.split())

                # once the answer is found, the chats will stop, correct answer is highlighted in green, and onto next question
                if str(strippedMessage).lower() in correctAnswers:
                    print(bcolors.OKGREEN + username + " - " + message + bcolors.ENDC)
                    correctAnswerFound = True
                else:
                    if username == nickname:
                        print(bcolors.OKCYAN + username + " - " + message + bcolors.ENDC)
                    # else:
                        # print(username + " - " + message)
    
    t1 = threading.Thread(target=threadingFunction)
    t2 = threading.Thread(target=threadingFunction)
    t3 = threading.Thread(target=threadingFunction)

    t1.start()
    time.sleep(.3)
    t2.start()
    time.sleep(.3)
    t3.start()
    time.sleep(.3)

    t1.join()
    t2.join()
    t3.join()

CodePudding user response:

First, it does not make much sense to have three threads read from the same socket in parallel; it only leads to confusion and race conditions.

The main problem, though, is that you are assuming a single recv always reads exactly one message. That is not how TCP works. TCP has no concept of a message; it is only a byte stream. A message is an application-level concept. A single recv might return a single message, multiple messages, parts of messages ...
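To make this concrete, here is a small sketch of what the parsing from the question does when one recv happens to return two chat messages at once. The usernames, host parts and message texts are made up for illustration; only the parsing expressions are taken from the question.

```python
# One hypothetical chunk, as sock.recv(2048) might return it on a busy
# channel: two complete chat messages delivered together.
chunk = (
    ":alice!alice@alice.tmi.twitch.tv PRIVMSG #xqcow :hello\r\n"
    ":bob!bob@bob.tmi.twitch.tv PRIVMSG #xqcow :paris\r\n"
)

# The parsing from the question, applied to the whole chunk at once.
username = chunk.split(":")[1].split("!")[0]
message = " ".join(chunk.split(":")[2].split())

# Only the first message is seen; bob's "paris" is never compared
# against the list of correct answers.
print(username, "-", message)

# Splitting on the IRC line terminator first recovers both messages.
lines = [line for line in chunk.split("\r\n") if line]
messages = [line.split(":", 2)[2] for line in lines]
print(messages)
```

If "paris" were the correct answer, the original loop would miss it here, which would look like the socket "skipping" messages.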

So you have to parse the data you receive according to the semantics defined by the application protocol, i.e.:

  1. initialize some buffer
  2. get some data from the socket and add it to the buffer - don't decode the data yet
  3. extract all complete messages from the buffer, then decode and process each message separately
  4. leave any remaining incomplete message in the buffer
  5. continue with #2
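
The steps above could be sketched roughly as follows. This is a minimal illustration, not a complete client: it assumes messages end with "\r\n" as in IRC, the helper name extract_lines is hypothetical, and a hard-coded list of byte chunks stands in for successive sock.recv(2048) calls so the example runs without a network connection.

```python
def extract_lines(buffer: bytearray) -> list:
    """Remove every complete CRLF-terminated line from buffer and return
    the lines decoded as text. An incomplete tail is left in the buffer."""
    lines = []
    while True:
        end = buffer.find(b"\r\n")
        if end == -1:
            break  # step 4: no complete line left, keep the remainder
        raw = bytes(buffer[:end])
        del buffer[:end + 2]  # drop the line and its terminator
        # Decode only complete lines; replace any invalid bytes.
        lines.append(raw.decode("utf-8", errors="replace"))
    return lines


buffer = bytearray()  # step 1

# Made-up chunks that cut messages at arbitrary points.
chunks = [
    b":alice!alice@alice.tmi.twitch.tv PRIVMSG #xqcow :hel",
    b"lo\r\n:bob!bob@bob.tmi.twitch.tv PRIVMSG #xqcow :par",
    b"is\r\nPING :tmi.twitch.tv\r\n",
]

received = []
for chunk in chunks:  # stands in for sock.recv(2048), step 2
    buffer += chunk
    received.extend(extract_lines(buffer))  # step 3, then back to step 2

for line in received:
    print(line)
```

In a real client the for loop would become a loop around sock.recv, and each returned line would go through the PING check and the username/message parsing one at a time.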

Apart from that, don't blindly throw away errors from recv(..).decode(..). Given that you are using a blocking socket, recv will usually fail only if there is a fatal problem with the connection, in which case a retry will not help. The failure most likely comes from calling decode on incomplete messages, which can mean invalid UTF-8. But since you simply ignore the error, you essentially lose those messages.
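
As an illustration of the decode failure, the sketch below cuts a UTF-8 encoded message in the middle of a multi-byte character, the way a chunk boundary might. The message text and the cut position are made up; the point is only that decoding a partial chunk raises UnicodeDecodeError, while decoding the reassembled bytes works.

```python
# A message ending in an emoji, which takes 4 bytes in UTF-8.
text = "nice \U0001F44D"
data = text.encode("utf-8")

# Pretend the chunk boundary fell inside the emoji.
first, second = data[:-2], data[-2:]

try:
    first.decode("utf-8")
    failed = False
except UnicodeDecodeError as exc:
    # Catch the specific error instead of a bare "except", so that real
    # connection problems are not silently swallowed as well.
    failed = True
    print("decoding a partial chunk failed:", exc.reason)

# Once both chunks are joined, the same bytes decode without error.
joined = (first + second).decode("utf-8")
print(joined == text)
```

With the bare except/continue loop from the question, the first chunk would be discarded after the failed decode, and the next recv would return only the tail of the message.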
