Why is "kill" not exiting a thread immediately?

Time:07-11

I am trying to write a simple script that spawns a thread to perform a task that may time out. (For the sake of a simple example for StackOverflow, I replaced the actual process with a sleep command.)

This program spawns a thread and then uses a cond_timedwait to monitor the thread and check if it has timed out. If a timeout occurs it calls the kill method on the thread with a "STOP" signal to notify the thread that it should exit.

use strict;
use threads;
use threads::shared;
use warnings;

my $var :shared;

my $thread = threads->create(sub {

    # Tell the thread how to handle the STOP signal
    local $SIG{'STOP'} = sub {
        print "Stop signal received\n";
        threads->exit();
    };

    # Perform a process that takes some time
    sleep 10;

    # Signal that the thread is complete
    lock($var); cond_signal($var);
});

# Current time + 1 second
my $wait_time = time() + 1;
my $timeout;

{
    # Wait for the thread to complete or until a timeout has occurred
    lock($var); $timeout = !cond_timedwait($var, $wait_time);
}

# Check if a timeout occurred
if ($timeout) {
    print "A timeout has occurred\n";

    # Signal the thread to stop
    $thread->kill('STOP')->join();
}
else {
    $thread->join();
}

This code runs successfully and prints the following output:

1 second passes...

A timeout has occurred

9 seconds pass...

Stop signal received

The problem is, even though a timeout is detected and the "STOP" signal is sent to the thread, the program still seems to be waiting the full 10 seconds before printing "Stop signal received" and exiting.

I tried changing it so it calls detach instead of join after killing the thread, but then the "Stop signal received" message is never printed, which means the program is exiting before the thread cleanly exits. I want to make sure the thread is actually interrupted and exits: in the real program I will want to kill and retry the process after the timeout has occurred, and the process won't work if another instance is already running on a detached thread.

How can I make it so the thread instantly prints the message and exits when it receives the "STOP" signal?

CodePudding user response:

These "signals" aren't actual OS signals, and there are operations they won't interrupt. The threads documentation says:

CAVEAT: The thread signalling capability provided by this module does not actually send signals via the OS. It emulates signals at the Perl-level such that signal handlers are called in the appropriate thread. For example, sending $thr->kill('STOP') does not actually suspend a thread (or the whole process), but does cause a $SIG{'STOP'} handler to be called in that thread (as illustrated above).
...
Correspondingly, sending a signal to a thread does not disrupt the operation the thread is currently working on: The signal will be acted upon after the current operation has completed. For instance, if the thread is stuck on an I/O call, sending it a signal will not cause the I/O call to be interrupted such that the signal is acted up immediately.

The granularity of what an "operation" is isn't stated, but sleep is clearly uninterruptible, so the signal handler runs only after it completes. With a different job to interrupt, the handler runs promptly:

use warnings;
use strict;
use feature 'say';

use threads;

say "Start at ", scalar localtime, " (", time, ")";

my $thread = threads->create(sub {

    # Tell the thread how to handle the STOP signal
    $SIG{'STOP'} = sub {
        say "\tStop signal received. Exiting at ", time;
        threads->exit();
    };

    say "\tIn the thread ", threads->tid;

    # Perform a process that takes some time
    #sleep 10;
    do { sleep 1; say "\tnappin'... ($_ sec)" } for 1..10;
});

sleep 3;
$thread->kill('STOP')->join();  # works differently with detach()

say "Main thread done, exiting at ", time;

Output

Start at Thu Jul  7 11:11:27 2022 (1657217487)
        In the thread 1
        nappin'... (1 sec)
        nappin'... (2 sec)
        Stop signal received. Exiting at 1657217490
Main thread done, exiting at 1657217490

With detach instead of join it still stops that loop at the right time but I see no indication that a signal handler ran. (In my tests I have the signal handler also write a file and with detach it doesn't.) It all works the same for me with a shared variable etc, like in the question.

This particular sleep doesn't matter, of course, but it all serves as a warning to test carefully with the actual jobs that the signal is meant to stop.

CodePudding user response:

Signals can only be sent to processes, so $thread->kill('STOP') can't possibly be sending an actual signal. Consequently, nothing interrupts sleep.

Between each statement, Perl checks if a "signal" came in. If it has, it handles it. So the "signal" is only handled once sleep completes.

If you had ten one-second sleeps instead of one ten-second sleep, the wait would be at most one second.
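As a rough illustration of that point, here is a minimal sketch that assumes a Perl built with thread support. The helper name sliced_sleep is hypothetical (it is not part of the threads module); it simply breaks one long sleep into one-second slices so that the pending "signal" can be noticed between statements.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper (not part of threads): sleep in one-second slices
# so a pending thread "signal" can be handled between statements.
sub sliced_sleep {
    my ($seconds) = @_;
    for my $i (1 .. $seconds) {
        sleep 1;
    }
    return;
}

my $start = time;

my $thread = threads->create(sub {
    # Handler is called in this thread once the current operation finishes
    local $SIG{'STOP'} = sub {
        print "Stop signal received\n";
        threads->exit();
    };

    # Stands in for the long-running job from the question
    sliced_sleep(10);
});

# Give the thread a moment to install its handler, then ask it to stop
sleep 1;
$thread->kill('STOP')->join();

my $elapsed = time - $start;
print "Thread stopped after about $elapsed second(s)\n";
```

With this structure, join should return roughly one slice after the kill call rather than after the full ten seconds. A real job would need its own natural checkpoints in place of the one-second sleeps, and a single long blocking call inside it would still delay the handler.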
