I am building a Kafka CDC pipeline, and the document I am following runs many docker run commands.
I want to put them all into a single docker-compose.yml,
but there is one last command that I cannot convert.
These are the commands:
docker run -d --name postgres \
-p 5432:5432 \
-e POSTGRES_USER=start_data_engineer \
-e POSTGRES_PASSWORD=password debezium/postgres:12
docker run -d --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 debezium/zookeeper:1.1
docker run -d --name kafka -p 9092:9092 --link zookeeper:zookeeper debezium/kafka:1.1
docker run -d --name connect -p 8083:8083 --link kafka:kafka \
--link postgres:postgres \
-e BOOTSTRAP_SERVERS=kafka:9092 \
-e GROUP_ID=sde_group \
-e CONFIG_STORAGE_TOPIC=sde_storage_topic \
-e OFFSET_STORAGE_TOPIC=sde_offset_topic debezium/connect:1.1
This is the command I cannot convert:
docker run -it --rm --name consumer --link zookeeper:zookeeper \
--link kafka:kafka debezium/kafka:1.1 \
watch-topic -a bankserver1.bank.holding --max-messages 1 | grep '^{' | jq
Here is my docker-compose.yml so far:
version: '2'
services:
  zookeeper:
    image: debezium/zookeeper
    ports:
      - 2181:2181
      - 2888:2888
      - 3888:3888
  kafka:
    image: debezium/kafka
    ports:
      - 9092:9092
    links:
      - zookeeper
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
  postgres:
    image: debezium/postgres:9.6
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
  connect:
    image: debezium/connect
    ports:
      - 8083:8083
      - 5005:5005
    links:
      - kafka
      - postgres
      - zookeeper
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_source_connect_statuses
  consumer:
    image: debezium/kafka:1.1
    links:
      - zookeeper
      - kafka
    command: watch-topic -a bankserver1.bank.holding --max-messages 1 | grep '^{' | jq
When I run docker-compose up, everything else runs normally, but the consumer always fails with this output:
The ZOOKEEPER_CONNECT variable must be set, or the container must be linked to one that runs Zookeeper.
consumer_1 | WARNING: Using default BROKER_ID=1, which is valid only for non-clustered installations.
consumer_1 | The ZOOKEEPER_CONNECT variable must be set, or the container must be linked to one that runs Zookeeper.
Update: for now I just want to read the messages and shut down, to make sure it works first.
Later I will have a source handle the reading.
docker run -it --rm --name consumer --link zookeeper:zookeeper --link kafka:kafka debezium/kafka:1.1 watch-topic -a bankserver1.bank.holding | grep --line-buffered '^{' | <your-file-path>/stream.py > my-output/holding_pivot.txt
CodePudding user response:
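The following will work. The key points are:
- I don't know exactly why, but ZOOKEEPER_CONNECT and KAFKA_BROKER are not set automatically, so set them explicitly. (A likely cause: docker run --link injects link environment variables into the container, whereas links in Compose file format version 2 do not.)
- Break the command into a list of arguments.
- Finally, the pipe (| grep '^{' | jq) never ran inside the container. In the docker run version it is handled by the host shell, so it is left out of command here.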
version: '2'
services:
  zookeeper:
    image: debezium/zookeeper
    ports:
      - 2181:2181
      - 2888:2888
      - 3888:3888
  kafka:
    image: debezium/kafka
    ports:
      - 9092:9092
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
  postgres:
    image: debezium/postgres:9.6
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
  connect:
    image: debezium/connect
    ports:
      - 8083:8083
      - 5005:5005
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_source_connect_statuses
  consumer:
    image: debezium/kafka:1.1
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_BROKER=kafka:9092
    command:
      - watch-topic
      - -a
      - bankserver1.bank.holding
      - --max-messages
      - "1"
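Because the pipe stays on the host, one way to get the filtered output back is to run the consumer service once and pipe its output through the same filter used in the original docker run command. This is only a sketch: json_lines is a hypothetical helper name, and the commented usage assumes the consumer service from the Compose file above, a running stack, and jq installed on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the lines that start with "{", i.e. the
# JSON change events, and drop warnings and other console noise.
# --line-buffered makes grep flush each matching line immediately.
json_lines() {
  grep --line-buffered '^{'
}

# Assumed usage (not run here; needs the Compose stack to be up):
#   docker-compose run --rm -T consumer | json_lines | jq
```

The -T flag turns off pseudo-TTY allocation for docker-compose run, which tends to behave better when the output goes into a pipe.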
CodePudding user response:
"the consumer always fail with this output"
As the error says, you need to provide ZOOKEEPER_CONNECT. However, you should be using entrypoint there, not command.
In any case, I don't know whether the Debezium container has the Python modules needed to pipe into stream.py, or what watch-topic does, but you don't need another debezium/kafka container, since you can exec into the running one:
docker-compose exec kafka \
bash -c "watch-topic -a bankserver1.bank.holding | grep --line-buffered '^{' | <your-file-path>/stream.py > my-output/holding_pivot.txt"