Why does my program not shut down when I use goroutine in main?

Time:07-26

Context

Please read the comments in the code carefully; everything is explained there.

In case you have experience using discordgo:

The full code can be found here: https://github.com/telephrag/kubinka/tree/bug (see packages strg and main). With the goroutine added, the command handlers stop working properly as well. Nothing that interacts with the database works at all (storing on /deploy and removing on /return). Users receive only "The application did not respond" instead of a proper response (see the packages with the cmd_ prefix).

package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "os/signal"
    "syscall"
    "time"

    "go.etcd.io/bbolt"
)

/*  TO REPRODUCE:
Start the program, wait a few seconds, and press ^C.
After a few attempts you should see the program fail to shut down.
*/

func WatchExpirations(ctx context.Context, db *bbolt.DB, bkt string) error {
    timeout := time.After(time.Second * 5)
    for {
        select {
        case <-timeout:
            tx, err := db.Begin(true)
            if err != nil {
                return fmt.Errorf("bolt: failed to start transaction")
            }

            bkt := tx.Bucket([]byte(bkt))
            c := bkt.Cursor()
            for k, v := c.First(); k != nil; k, v = c.Next() {
                // do stuff with bucket...
                fmt.Println(v) // check if v matches condition, delete if does

                if err := tx.Commit(); err != nil { // BUG: committing transaction in a loop
                    tx.Rollback()
                    return fmt.Errorf("bolt: failed to commit transaction: %w", err)
                }
                timeout = time.After(time.Second * 5)
            }

        case <-ctx.Done():
            return ctx.Err()
        }
    }
}

func main() {
    ctx, cancel := context.WithCancel(context.Background())

    db, err := bbolt.Open("kubinka.db", 0666, nil)
    if err != nil {
        log.Panicf("failed to open db %s: %v", "kubinka.db", err)
    }

    if err = db.Update(func(tx *bbolt.Tx) error {
        _, err := tx.CreateBucketIfNotExists([]byte("players"))
        if err != nil {
            return fmt.Errorf("failed to create bucket %s: %w", "players", err)
        }
        return nil
    }); err != nil {
        log.Panic(err)
    }

    defer func() { // BUG?: panicking inside defer
        if err := db.Close(); err != nil { // closes normally in debug mode
            log.Panicf("error closing db conn: %v", err) // hangs otherwise
        }
    }()

    // use `ds` to handle commands from user while storing ctx internally

    go func() {
        err = WatchExpirations(ctx, db, "players")
        if err != nil {
            log.Printf("error while watching expirations in db")
            cancel()
        }
    }()

    interrupt := make(chan os.Signal, 1)
    signal.Notify(interrupt, syscall.SIGTERM, syscall.SIGINT)
    for {
        select {
        // as seen in the debugger, this branch is reached,
        // but then the program stalls forever
        case <-interrupt:
            log.Println("Execution stopped by user")
            cancel()
            return // is called but program doesn't stop
        case <-ctx.Done():
            log.Println("ctx cancelled")
            return
        default:
            time.Sleep(time.Millisecond * 200)
        }
    }
}

Answer:

As per the comment in your repo, the issue appears to have been here:

tx, err := db.Begin(true)
if err != nil {
   return fmt.Errorf("bolt: failed to start transaction")
}
bkt := tx.Bucket([]byte(bkt))
c := bkt.Cursor()
for k, v := c.First(); k != nil; k, v = c.Next() {
    // do stuff with bucket...
    fmt.Println(v) // check if v matches condition, delete if does

    if err := tx.Commit(); err != nil { // BUG: committing transaction in a loop
        tx.Rollback()
        return fmt.Errorf("bolt: failed to commit transaction: %w", err)
    }
    timeout = time.After(time.Second * 5)
}

The loop can iterate zero or more times:

  • If there are no iterations, tx is never committed or rolled back, and timeout is not reset (so case <-timeout: will never fire again).
  • If there is more than one iteration, tx.Commit() is called again on a transaction that is already closed, which is an error.

This probably led to the issue you saw. The documentation for the bolt Close function says:

Close releases all database resources. All transactions must be closed before closing the database.

So if a transaction is still open, Close blocks until it completes (internally, bolt locks a mutex when the transaction begins and releases it when the transaction is done).
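To make the blocking easier to picture, here is a toy model that uses only the standard library. It is not bolt's implementation: toyDB, toyTx and CanClose are hypothetical names made up for this sketch, and the only assumption carried over from the explanation above is that a write transaction and Close compete for the same lock.

```go
package main

import (
	"fmt"
	"sync"
)

// toyDB is a hypothetical stand-in for a database handle. It only models
// "one lock shared by write transactions and Close"; it is not bolt.
type toyDB struct {
	writeLock sync.Mutex
}

// toyTx models a write transaction that holds the lock until it is finished.
type toyTx struct {
	db   *toyDB
	done bool
}

// Begin takes the lock; it stays held until Commit or Rollback is called.
func (db *toyDB) Begin() *toyTx {
	db.writeLock.Lock()
	return &toyTx{db: db}
}

// finish releases the lock once and reports an error on any later call.
func (tx *toyTx) finish() error {
	if tx.done {
		return fmt.Errorf("tx already closed")
	}
	tx.done = true
	tx.db.writeLock.Unlock()
	return nil
}

func (tx *toyTx) Commit() error   { return tx.finish() }
func (tx *toyTx) Rollback() error { return tx.finish() }

// CanClose reports whether a Close could take the lock right now.
// A blocking Close would call Lock and wait instead of returning false.
func (db *toyDB) CanClose() bool {
	if !db.writeLock.TryLock() { // TryLock needs Go 1.18 or newer
		return false
	}
	db.writeLock.Unlock()
	return true
}

func main() {
	db := &toyDB{}
	tx := db.Begin()
	fmt.Println("can close with open tx:", db.CanClose()) // false
	_ = tx.Rollback()
	fmt.Println("can close after rollback:", db.CanClose()) // true
}
```

In the question's program the goroutine plays the part of the open toyTx: when the cursor loop runs zero times nothing releases the lock, so the deferred Close in main waits forever.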

The solution is to ensure that the transaction is always closed (and only closed once).
