Using Go generics to implement a chain of processors

Time:04-28

I am trying to implement a simple processing pipeline in Go, where each processor has a fixed input and output type, plus a list of successor processors. Each successor takes the current processor's output type as its input, and may have successors of its own.

I am running into issues on how to add successor processors to the current one, regardless of their output type. I tried using any as a wildcard type like I would do with ? in Java, but Go is not having it.

What I have in Go is this:

type Processor[InputType any, OutputType any] struct {
    nextProcessors     []*Processor[OutputType, any]
    ProcessingFunction func(InputType) OutputType
}

func (b *Processor[InputType, OutputType]) Process(input InputType) {
    result := b.ProcessingFunction(input)
    for _, nextProcessor := range b.nextProcessors {
        nextProcessor.Process(result)
    }
}

func (b *Processor[InputType, OutputType]) AddOutputProcessor(p *Processor[OutputType, any]) {
    b.nextProcessors = append(b.nextProcessors, p)
}

func main() {
    outputer := Processor[int, string]{ProcessingFunction: func(input int) string {
        print(input)
        return string(input)
    }}
    doubler := Processor[int, int]{ProcessingFunction: func(input int) int { return input * 2 }}
    rng := Processor[int, int]{ProcessingFunction: func(input int) int { return rand.Intn(input) }}
    rng.AddOutputProcessor(&doubler)
    doubler.AddOutputProcessor(&outputer)
    rng.Process(20)
}

Which gives me a compilation error:

cannot use &doubler (value of type *Processor[int, int]) as type *Processor[int, any]

Is there a way to ignore the output type of the successor processor? Or should I maybe go a different way about it? I would just like to make sure that successor processors can accept the right type of input.

For reference, here is the interface definition in Java that works the way I intend it to.

public interface Processor<InputType, OutputType> {
    void addOutputProcessor(Processor<OutputType, ?> outputProcessor);
    void process(InputType input);
}

CodePudding user response:

Is there a way to ignore the output type of the successor processor?

No.

In Go, any is just a static type (an alias of interface{}). It can never be a replacement for Java's unbounded wildcard ?. So *Processor[int, any] is simply not the same type as *Processor[int, int], and you can't assign one to the other, as your error message reports.

To build an arbitrarily long chain you would need to parametrize the Process method itself, but that is not possible in Go 1.18: methods cannot declare their own type parameters, so all type parameters must be declared on the type. Even then, you would keep running into the same problem of not knowing the output type of the next processor.

Generally speaking, using a for loop can't work because the static types of the in/out values keep changing.

I believe the closest you can get without reflection is to implement some sort of composition operator, like . in Haskell, via a top-level function. But you would have to manually nest calls.

A simplified example (the type Processor is redundant, but keeping it closer to your code):

package main

import (
    "fmt"
    "strconv"
)

type Processor[In, Out any] func(In) Out

func Process[In, Out any](input In, processor Processor[In, Out]) Out {
    return processor(input)
}

func main() {
    parser := Processor[string, int](func(input string) int { s, _ := strconv.Atoi(input); return s })
    doubler := Processor[int, int](func(input int) int { return input * 2 })
    outputer := Processor[int, string](func(input int) string { return fmt.Sprintf("%d", input) })

    out := Process(Process(Process("20", parser), doubler), outputer)
    fmt.Println(out)
}

Playground: https://go.dev/play/p/Iv-virKATyb

CodePudding user response:

Using any as a type argument does not turn it into a wildcard.

nextProcessors     []*Processor[OutputType, any] // any is an ordinary type argument here, not a wildcard

The declaration itself is valid, but then the second type argument of every successor must always be exactly interface{}. That is not part of the answer to your question, though.

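To solve your issue, you can use a generic interface instead. The interface only mentions the input type, so any *Processor[In, Out] satisfies IProcess[In] whatever its Out is: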

type IProcess[InputType any] interface {
    Process(input InputType)
}

type Processor[InputType any, OutputType any] struct {
    nextProcessors     []IProcess[OutputType]
    ProcessingFunction func(InputType) OutputType
}

func (b *Processor[InputType, OutputType]) AddOutputProcessor(p IProcess[OutputType]) {
    b.nextProcessors = append(b.nextProcessors, p)
}
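Put together with the Process method from the question, a complete program could look roughly like this. The runChain function and the source processor are hypothetical: I replaced the random number generator with a fixed "add one" step so the output is predictable, and the outputer records what it produced instead of printing directly.

```go
package main

import (
	"fmt"
	"strconv"
)

// IProcess hides the output type: only the input type is visible.
type IProcess[InputType any] interface {
	Process(input InputType)
}

type Processor[InputType any, OutputType any] struct {
	nextProcessors     []IProcess[OutputType]
	ProcessingFunction func(InputType) OutputType
}

// Process runs this processor and forwards the result to every successor.
func (b *Processor[InputType, OutputType]) Process(input InputType) {
	result := b.ProcessingFunction(input)
	for _, next := range b.nextProcessors {
		next.Process(result)
	}
}

func (b *Processor[InputType, OutputType]) AddOutputProcessor(p IProcess[OutputType]) {
	b.nextProcessors = append(b.nextProcessors, p)
}

// runChain is a hypothetical helper: it wires source -> doubler -> outputer
// and returns every string the outputer produced.
func runChain(start int) []string {
	var seen []string
	outputer := &Processor[int, string]{ProcessingFunction: func(input int) string {
		s := strconv.Itoa(input)
		seen = append(seen, s)
		return s
	}}
	doubler := &Processor[int, int]{ProcessingFunction: func(input int) int { return input * 2 }}
	source := &Processor[int, int]{ProcessingFunction: func(input int) int { return input + 1 }}

	source.AddOutputProcessor(doubler)   // *Processor[int, int] satisfies IProcess[int]
	doubler.AddOutputProcessor(outputer) // *Processor[int, string] satisfies IProcess[int] too
	source.Process(start)
	return seen
}

func main() {
	fmt.Println(runChain(20)) // [42]
}
```

A successor with the wrong input type, for example a *Processor[string, int] added to doubler, is rejected at compile time, which is the guarantee the question asks for.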