Produce Avro data from Windows with Docker


I'm following the How to transform a stream of events tutorial. Everything works fine until the topic-creation step.

Under the heading Produce events to the input topic:

docker exec -i schema-registry /usr/bin/kafka-avro-console-producer --topic raw-movies --bootstrap-server broker:9092 --property value.schema="$(< src/main/avro/input_movie_event.avsc)"

I'm getting:

<: The term '<' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

What would be the proper way of passing the Avro schema file to --property value.schema?
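For reference, the "$(< file)" part of the tutorial's command is bash-only syntax: command substitution with input redirection, which expands to the file's contents. PowerShell has no such operator, hence the "<" error above. A minimal bash sketch with a throwaway file (path hypothetical) showing what the tutorial command expects to happen:

```shell
# Write a tiny stand-in schema to a temp file.
cat > /tmp/demo.avsc <<'EOF'
{"type": "record", "name": "Demo", "fields": []}
EOF

# In bash, $(< file) expands to the file's contents:
schema="$(< /tmp/demo.avsc)"
echo "value.schema=$schema"
```

In PowerShell the closest equivalent would be Get-Content -Raw, not the "<" operator.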

All Confluent Kafka services are running fine. The Schema Registry is empty at this point:

PS C:\Users\Joe> curl -X GET http://localhost:8081/subjects
[]

How can I register the Avro file in the Schema Registry manually from the CLI? I'm not finding options for that in the Schema Registry API.

My thinking was: if I register the schema manually, then I would be able to reference it this way.

EDIT 1

Tried storing the Avro file path in a PowerShell variable:

$avroPath = 'D:\ConfluentKafkaDocker\kafkaStreamsDemoProject\src\main\avro\input_movie_event.avsc'

And then executing:

docker exec -i schema-registry /usr/bin/kafka-avro-console-producer --topic raw-movies --bootstrap-server broker:9092 --property value.schema=$avroPath

But that didn't work.

EDIT 2

Managed to get it working with:

$avroPath = 'D:\ConfluentKafkaDocker\kafkaStreamsDemoProject\src\main\avro\input_movie_event.avsc'
docker exec -i schema-registry /usr/bin/kafka-avro-console-producer --topic raw-movies --bootstrap-server broker:9092 --property value.schema.file=$avroPath

But now I'm getting:

org.apache.kafka.common.config.ConfigException: Error reading schema from D:\ConfluentKafkaDocker\kafkaStreamsDemoProject\src\main\avro\input_movie_event.avsc
    at io.confluent.kafka.formatter.SchemaMessageReader.getSchemaString(SchemaMessageReader.java:260)
    at io.confluent.kafka.formatter.SchemaMessageReader.getSchema(SchemaMessageReader.java:222)
    at io.confluent.kafka.formatter.SchemaMessageReader.init(SchemaMessageReader.java:153)
    at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:43)
    at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)

input_movie_event.avsc:

{
  "namespace": "io.confluent.developer.avro",
  "type": "record",
  "name": "RawMovie",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "title", "type": "string"},
    {"name": "genre", "type": "string"}
  ]
} 

It's copy-pasted from the example, so I see no reason why it would be incorrectly formatted.

EDIT 3

Tried with forward slashes, since PowerShell accepts them:

 value.schema.file=src/main/avro/input_movie_event.avsc

and then with backslashes:

value.schema.file=src\main\avro\input_movie_event.avsc

I'm getting the same error as in Edit 2, so it looks like the value.schema.file flag is not working properly.

EDIT 4

Tried with value.schema="$(cat src/main/avro/input_movie_event.avsc)" as suggested here:

The error I'm getting now is:

[2022-04-05 10:17:24,135] ERROR Could not parse Avro schema (io.confluent.kafka.schemaregistry.avro.AvroSchemaProvider)
org.apache.avro.SchemaParseException: com.fasterxml.jackson.core.JsonParseException: Unexpected character ('n' (code 110)): was expecting double-quote to start field name
 at [Source: (String)"{ namespace: io.confluent.developer.avro, type: record, name: RawMovie, fields: [ {name: id, type: long}, {name: title, type: string}, {name: genre, type: string} ] }"; line: 1, column: 6]
    at org.apache.avro.Schema$Parser.parse(Schema.java:1427)
    at org.apache.avro.Schema$Parser.parse(Schema.java:1413)
    at io.confluent.kafka.schemaregistry.avro.AvroSchema.<init>(AvroSchema.java:70)
    at io.confluent.kafka.schemaregistry.avro.AvroSchemaProvider.parseSchema(AvroSchemaProvider.java:54)
    at io.confluent.kafka.schemaregistry.SchemaProvider.parseSchema(SchemaProvider.java:63)
    at io.confluent.kafka.formatter.SchemaMessageReader.parseSchema(SchemaMessageReader.java:212)
    at io.confluent.kafka.formatter.SchemaMessageReader.getSchema(SchemaMessageReader.java:224)
    at io.confluent.kafka.formatter.SchemaMessageReader.init(SchemaMessageReader.java:153)
    at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:43)
    at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)

In the error it says it was expecting a double-quote to start a field name, and it shows name: id, while in the file I have:

"fields": [
        {"name": "id", "type": "long"},
        {"name": "title", "type": "string"},
        {"name": "genre", "type": "string"}
      ]

Why is it parsing incorrectly, as if there were no double quotes, when in the file they are actually there?
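Note that the (String) source in the error already shows the schema without its double quotes, so they are stripped before the parser ever runs; the file itself is fine. For comparison, a JSON parser fails in exactly this way on the quoteless form (Python, illustrative only):

```python
import json

# A field exactly as it appears in input_movie_event.avsc (quotes intact):
good = '{"name": "id", "type": "long"}'
# What the parser actually received, per the (String) source in the error:
bad = '{name: id, type: long}'

assert json.loads(good) == {"name": "id", "type": "long"}  # parses fine

try:
    json.loads(bad)
except json.JSONDecodeError as e:
    msg = e.msg
    print(msg)  # Expecting property name enclosed in double quotes
```

So the question is really where the quotes get eaten between PowerShell and the process inside the container.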

EDIT 6

Tried with value.schema="$(type src/main/avro/input_movie_event.avsc)"

since type is the Windows equivalent of cat. Got the same error as in Edit 5. Tried with Get-Content as suggested here: same error.

CodePudding user response:

How can I register Avro file in Schema manually from CLI?

You would not use a Producer, or Docker.

You can use Postman to send a POST request (or the PowerShell equivalent of curl, Invoke-RestMethod) to the /subjects/{subject}/versions endpoint, as the Schema Registry API documentation describes for registering schemas.
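A sketch of what that request body looks like (Python; the subject name raw-movies-value follows the default topic-name strategy and is an assumption based on your topic). The Schema Registry expects the Avro schema embedded as an escaped JSON string inside a {"schema": ...} wrapper, which is the part that usually trips people up:

```python
import json

# The schema from input_movie_event.avsc:
schema = {
    "namespace": "io.confluent.developer.avro",
    "type": "record",
    "name": "RawMovie",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "title", "type": "string"},
        {"name": "genre", "type": "string"},
    ],
}

# The POST body: the schema itself must be a JSON *string*, not a nested object.
body = json.dumps({"schema": json.dumps(schema)})

# Send this with any HTTP client, e.g.:
#   POST http://localhost:8081/subjects/raw-movies-value/versions
#   Content-Type: application/vnd.schemaregistry.v1+json
print(body[:80])
```

The response returns the schema id, which you can then pass to the console producer.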


After that, using value.schema.id, as linked, will work.

Or, if you don't want to install anything else, I'd stick with value.schema.file. That said, you must start the container with this file (or the whole src\main\avro folder) mounted as a Docker volume, and then reference it by its container path, not a Windows path, in the docker exec command. My linked answer referring to the cat usage assumes your files are on the same filesystem.
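For example (the /schemas mount point and the compose snippet are hypothetical; adapt them to your setup). The container is started with the schema folder mounted, and docker exec then uses the container-side path:

```shell
# Hypothetical mount point inside the schema-registry container.
CONTAINER_AVRO_DIR=/schemas

# 1) Mount the folder when starting the container, e.g. in docker-compose.yml:
#      volumes:
#        - ./src/main/avro:/schemas
#
# 2) Reference the container path, not the Windows path, in docker exec:
CMD="docker exec -i schema-registry /usr/bin/kafka-avro-console-producer \
--topic raw-movies --bootstrap-server broker:9092 \
--property value.schema.file=${CONTAINER_AVRO_DIR}/input_movie_event.avsc"
echo "$CMD"
```

Because the file is read inside the container, no quoting or redirection happens on the PowerShell side at all.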

Otherwise, the exec command is interpreted by PowerShell first, so input redirection won't work; type would be the correct command, but the $() syntax might not behave the same way, as that's for UNIX shells.

Related - PowerShell: Store Entire Text File Contents in Variable
