terminating blocking goroutines with errgroup

Time:09-27

I have two tasks running in goroutines. I am using errgroup, but I am not sure how to use errgroup.WithContext correctly.

In the following code, task1 returns an error and I would like to terminate task2 (long running) when that happens. Note that time.Sleep is used in this example only to simulate my problem. In reality, task1 and task2 do real work and do not contain any sleep calls.

package main

import (
    "context"
    "fmt"
    "golang.org/x/sync/errgroup"
    "time"
)

func task1(ctx context.Context) error {
    time.Sleep(5 * time.Second)
    fmt.Println("first finished, pretend error happened")
    return ctx.Err()
}

func task2(ctx context.Context) error {
    select {
    case <-ctx.Done():
        fmt.Println("task 1 is finished with error")
        return ctx.Err()
    default:
        fmt.Println("second started")
        time.Sleep(50 * time.Second)
        fmt.Println("second finished")
    }
    return nil
}

func test() (err error) {
    ctx := context.Background()
    g, gctx := errgroup.WithContext(ctx)

    g.Go(func() error {
        return task1(gctx)
    })

    g.Go(func() error {
        return task2(gctx)
    })

    err = g.Wait()
    if err != nil {
        fmt.Println("wait done")
    }

    return err
}

func main() {
    fmt.Println("main")
    err := test()
    if err != nil {
        fmt.Println("main err")
        fmt.Println(err.Error())
    }
}


Answer:

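It is up to each task to handle context cancellation itself. A select whose default branch calls time.Sleep does not do that: the select checks ctx.Done() once, takes the default branch immediately, and then blocks in time.Sleep, where cancellation is no longer observed.

As stated in the errgroup documentation: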

WithContext returns a new Group and an associated Context derived from ctx.

The derived Context is canceled the first time a function passed to Go returns a non-nil error or the first time Wait returns, whichever occurs first.

You are using errgroup correctly, but your context handling needs a refactor.

Here is a refactor of your task2:

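// Needs "context", "errors", "fmt" and "time" in the import block.
func task2(ctx context.Context) error {
    // Buffered so the worker goroutine can send and exit even when nobody
    // is receiving any more because the context was canceled first.
    done := make(chan struct{}, 1)

    go func() {
        time.Sleep(50 * time.Second) // stands in for the real work
        done <- struct{}{}
    }()

    select {
    case <-ctx.Done():
        // Another task failed (or Wait returned) before the work finished.
        return fmt.Errorf("context done: %w", ctx.Err())
    case <-done:
        return errors.New("task 2 failed")
    }
}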

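With this select, task2 waits for whichever channel becomes ready first. Here that is the context cancellation, unless you make the sleep shorter than the one in task1. Note that the group context is canceled only when task1 returns a non-nil error. In the question's code task1 returns ctx.Err(), which is still nil at that point, so return a real error instead, such as one created with errors.New.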
