Import protobuf file from GitHub repository


I currently have two protobuf repos: api and timestamp:

timestamp Repo:

- README.md
- timestamp.proto
- timestamp.pb.go
- go.mod
- go.sum

api Repo:

- README.md
- protos/
  - dto1.proto
  - dto2.proto

Currently, timestamp defines a Timestamp message that I want to use in api, but I'm not sure how the import should work or how I should modify the compilation process to handle it. Complicating matters, the api repo is compiled into a separate, downstream repo for Go called api-go.

For example, consider dto1.proto:

syntax = "proto3";
package api.data;

import "<WHAT GOES HERE?>";

option go_package = "github.com/my-user/api/data"; // golang

message DTO1 {
    string id = 1;
    Timestamp timestamp = 2;
}

And my compilation command is this:

find $GEN_PROTO_DIR -type f -name "*.proto" -exec protoc \
    --go_out=$GEN_OUT_DIR --go_opt=module=github.com/my-user/api-go \
    --go-grpc_out=$GEN_OUT_DIR --go-grpc_opt=module=github.com/my-user/api-go \
    --grpc-gateway_out=$GEN_OUT_DIR --grpc-gateway_opt=logtostderr=true \
    --grpc-gateway_opt=paths=source_relative \
    --grpc-gateway_opt=generate_unbound_methods=true {} \;

Assuming I have a definition in timestamp for each of the programming languages I want to compile api into, how would I import this into the .proto file and what should I do to ensure that the import doesn't break in my downstream repo?

CodePudding user response:

Protobuf has no native notion of remote import paths, so every import path must be relative to some local filesystem base path that you indicate via -I / --proto_path.

Option 1

Generally it is easiest to have a single repository with all the protobuf definitions for your organisation, e.g. a repository named acme-contract:

.
└── protos
    └── acme
        ├── api
        │   └── data
        │       ├── dto1.proto
        │       └── dto2.proto
        └── timestamp
            └── timestamp.proto

Your dto1.proto will look something like:

syntax = "proto3";

package acme.api.data;

import "acme/timestamp/timestamp.proto";

message DTO1 {
  string id = 1;
  acme.timestamp.Timestamp timestamp = 2;
}

As long as you generate code relative to the protos/ dir of this repository, there shouldn't be an issue.
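For illustration, a generation command run from the repo root might look like the following (the gen/ output directory and source_relative option are assumptions, not part of the original setup):

```shell
# Run from the acme-contract repo root; gen/ is a hypothetical output dir.
# -I protos makes every import path relative to protos/, so
# "acme/timestamp/timestamp.proto" resolves correctly.
protoc \
    -I protos \
    --go_out=gen --go_opt=paths=source_relative \
    --go-grpc_out=gen --go-grpc_opt=paths=source_relative \
    protos/acme/api/data/dto1.proto \
    protos/acme/api/data/dto2.proto \
    protos/acme/timestamp/timestamp.proto
```

Because every file lives under the same protos/ root, no cross-repository coordination is needed.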

Option 2

There are various alternatives whereby you continue to have definitions split over various repositories, but you can't really escape the fact that imports are filesystem relative.

Historically that could be handled by manually cloning the various repositories and arranging directories so that the import paths resolve, or by using -I to point to various locations that might intentionally or incidentally contain the proto files (e.g. in $GOPATH). Those strategies tend to be messy and difficult to maintain.
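As a sketch of that manual approach, assuming both repos are cloned side by side and dto1.proto imports the file as "timestamp.proto" (paths and env vars here are illustrative):

```shell
# Hypothetical workspace layout: clone both repos under $SRC.
SRC=$HOME/src
git clone https://github.com/my-user/timestamp "$SRC/timestamp"
git clone https://github.com/my-user/api "$SRC/api"

# One -I per repo root; protoc resolves the import against each in turn.
protoc \
    -I "$SRC/api/protos" \
    -I "$SRC/timestamp" \
    --go_out="$GEN_OUT_DIR" --go_opt=module=github.com/my-user/api-go \
    "$SRC/api/protos/dto1.proto"
```

The fragility is visible here: every consumer must reproduce this exact directory arrangement before compiling.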

buf makes things somewhat easier now. If you were to have your timestamp repo:

.
├── buf.gen.yaml
├── buf.work.yaml
├── gen
│   └── acme
│       └── timestamp
│           └── timestamp.pb.go
├── go.mod
├── go.sum
└── protos
    ├── acme
    │   └── timestamp
    │       └── timestamp.proto
    ├── buf.lock
    └── buf.yaml

timestamp.proto looking like:

syntax = "proto3";

package acme.timestamp;

option go_package = "github.com/my-user/timestamp/gen/acme/timestamp";

message Timestamp {
  int64 unix = 1;
}

buf.gen.yaml looking like:

version: v1
plugins:
  - name: go
    out: gen
    opt: paths=source_relative
  - name: go-grpc
    out: gen
    opt:
      - paths=source_relative
      - require_unimplemented_servers=false
  - name: grpc-gateway
    out: gen
    opt:
      - paths=source_relative
      - generate_unbound_methods=true

... and everything under gen/ has been generated via buf generate.
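For reference, the buf.work.yaml and protos/buf.yaml backing this layout might look like the following (the BSR module name is an assumption):

```yaml
# buf.work.yaml (repo root) - points buf at the protos/ module
version: v1
directories:
  - protos
```

```yaml
# protos/buf.yaml - names the module so other repos can depend on it
version: v1
name: buf.build/your-user/timestamp
breaking:
  use:
    - FILE
lint:
  use:
    - DEFAULT
```

Running buf push from the protos/ directory publishes the module to the BSR, which is what allows the api repo below to declare it as a dependency.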

Then in your api repository:

.
├── buf.gen.yaml
├── buf.work.yaml
├── gen
│   └── acme
│       └── api
│           └── data
│               ├── dto1.pb.go
│               └── dto2.pb.go
└── protos
    ├── acme
    │   └── api
    │       └── data
    │           ├── dto1.proto
    │           └── dto2.proto
    ├── buf.lock
    └── buf.yaml

With buf.yaml looking like:

version: v1
name: buf.build/your-user/api
deps:
  - buf.build/your-user/timestamp
breaking:
  use:
    - FILE
lint:
  use:
    - DEFAULT
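With that deps entry declared, buf.lock can be created or refreshed before generating; run the update from the directory containing buf.yaml (commands shown are the buf v1 CLI):

```shell
# Resolve buf.build/your-user/timestamp into protos/buf.lock ...
cd protos
buf mod update
cd ..

# ... then generate against the pinned dependency.
buf generate
```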

dto1.proto looking like:

syntax = "proto3";

package acme.api.data;

import "acme/timestamp/timestamp.proto";

option go_package = "github.com/your-user/api/gen/acme/api/data";

message DTO1 {
  string id = 1;
  acme.timestamp.Timestamp timestamp = 2;
}

and buf.gen.yaml the same as in the timestamp repo.

The code generated via buf generate will depend on the timestamp repository via Go modules:

// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
//  protoc-gen-go v1.28.1
//  protoc        (unknown)
// source: acme/api/data/dto1.proto

package data

import (
    timestamp "github.com/your-user/timestamp/gen/acme/timestamp"
    protoreflect "google.golang.org/protobuf/reflect/protoreflect"
    protoimpl "google.golang.org/protobuf/runtime/protoimpl"
    reflect "reflect"
    sync "sync"
)

// <snip>

Note that if changes are made to dependencies you'll need to ensure that both buf and Go modules are kept relatively in sync.
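In practice, after the timestamp module changes, keeping the two in sync means something like the following (illustrative commands; the module paths match the examples above):

```shell
# Pull the new proto definitions into buf.lock ...
cd protos && buf mod update && cd ..

# ... pull the matching generated Go code into go.mod/go.sum ...
go get github.com/your-user/timestamp@latest

# ... and regenerate local code against the updated definitions.
buf generate
```

If the two drift apart, the locally generated code and the imported Go package can disagree about the shape of acme.timestamp.Timestamp.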

Option 3

If you prefer not to leverage Go modules for importing generated pb code, you could also look to have a similar setup to Option 2, but instead generate all code into a separate repository (similar to what you're doing now, by the sounds of it). This is most easily achieved using buf's managed mode, which essentially ignores the go_package options declared in the proto files and rewrites them under a configured prefix.

In api-go:

.
├── buf.gen.yaml
├── go.mod
└── go.sum

With buf.gen.yaml containing:

version: v1
managed:
  enabled: true
  go_package_prefix:
    default: github.com/your-user/api-go/gen
plugins:
  - name: go
    out: gen
    opt: paths=source_relative
  - name: go-grpc
    out: gen
    opt:
      - paths=source_relative
      - require_unimplemented_servers=false
  - name: grpc-gateway
    out: gen
    opt:
      - paths=source_relative
      - generate_unbound_methods=true

You'd then need to generate code for each respective repo (pushed to the BSR):

$ buf generate buf.build/your-user/api
$ buf generate buf.build/your-user/timestamp

After which you should have some generated code for both:

.
├── buf.gen.yaml
├── gen
│   └── acme
│       ├── api
│       │   └── data
│       │       ├── dto1.pb.go
│       │       └── dto2.pb.go
│       └── timestamp
│           └── timestamp.pb.go
├── go.mod
└── go.sum
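Note that api-go's go.mod then only needs the protobuf/grpc runtime libraries, not the timestamp repo, since all generated code lives inside this one module. A sketch of what it might contain (versions are placeholders):

```go
// go.mod for api-go; the module path matches go_package_prefix above.
module github.com/your-user/api-go

go 1.19

require (
    github.com/grpc-ecosystem/grpc-gateway/v2 v2.15.0
    google.golang.org/grpc v1.53.0
    google.golang.org/protobuf v1.28.1
)
```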

And the imports will be relative to the current module:

// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
//  protoc-gen-go v1.28.1
//  protoc        (unknown)
// source: acme/api/data/dto1.proto

package data

import (
    timestamp "github.com/your-user/api-go/gen/acme/timestamp"
    protoreflect "google.golang.org/protobuf/reflect/protoreflect"
    protoimpl "google.golang.org/protobuf/runtime/protoimpl"
    reflect "reflect"
    sync "sync"
)

// <snip>

All in all, I'd recommend Option 1 - consolidating your protobuf definitions into a single repository (including vendoring 3rd party definitions) - unless there is a particularly strong reason not to.
