I am trying to back up my Kafka topic to S3 following this guide.
I have filled in all the blanks in the configuration and set aws.region to eu-north-1.
aws kafkaconnect create-connector \
--capacity "autoScaling={maxWorkerCount=2,mcuCount=1,minWorkerCount=1,scaleInPolicy={cpuUtilizationPercentage=10},scaleOutPolicy={cpuUtilizationPercentage=80}}" \
--connector-configuration \
"connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector, \
key.converter.schemas.enable=false, \
connect.s3.kcql=INSERT INTO <<S3 Bucket Name>>:my_workload SELECT * FROM source_topic PARTITIONBY _header.year\,_header.month\,_header.day\,_header.hour STOREAS \`JSON\` WITHPARTITIONER=KeysAndValues WITH_FLUSH_COUNT = 5, \
aws.region=us-east-1, \ <----------- changed to eu-north-1
tasks.max=2, \
topics=source_topic, \
schema.enable=false, \
errors.log.enable=true, \
value.converter=org.apache.kafka.connect.storage.StringConverter, \
key.converter=org.apache.kafka.connect.storage.StringConverter " \
--connector-name "backup-msk-to-s3-v1" \
--kafka-cluster '{"apacheKafkaCluster": {"bootstrapServers": "<<MSK broker list>>","vpc": {"securityGroups": [ <<Security Group>> ],"subnets": [ <<Subnet List>> ]}}}' \
--kafka-cluster-client-authentication "authenticationType=NONE" \
--kafka-cluster-encryption-in-transit "encryptionType=PLAINTEXT" \
--kafka-connect-version "2.7.1" \
The bucket is created like this:
aws s3api create-bucket --bucket my-msk-backup --create-bucket-configuration LocationConstraint=eu-north-1
The connector fails due to this error message:
[Worker-026a2ebf1106735e1] [2022-08-30 13:11:15,695] ERROR [test03|task-0] WorkerSinkTask{id=test03-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:191)
[Worker-026a2ebf1106735e1] org.jclouds.aws.AWSResponseException: request GET https://my-msk-backup.s3.amazonaws.com/?prefix=my_workload&max-keys=1000 HTTP/1.1 failed with code 400, error: AWSError{requestId='redacted', requestToken='/redacted/redacted /redacted s18=', code='AuthorizationHeaderMalformed', message='The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'eu-north-1'', context='{Region=eu-north-1, HostId=/redacted redacted/redacted /redacted s18=}'}
How can I troubleshoot this further?
CodePudding user response:
This error can occur when you have recently moved your bucket from one region to another (by deleting it in one region and recreating it in another). Bucket configuration is eventually consistent, so the bucket may still appear in the old region for some time. Normally you can't recreate a bucket with the same name right after deleting it, because bucket names are globally unique. If the creation does succeed anyway, this error may still appear for a while, even when you point to the region where you recreated the bucket, because the record has not yet been updated and propagated to the other regions. I haven't found anything on this apart from this and this.
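To check which region S3 currently reports for the bucket, something like the following may help. It is only a sketch: the function names (normalize_region, bucket_region, check_bucket_region) are made up for illustration, and it assumes the AWS CLI is installed and your credentials are allowed to call s3:GetBucketLocation. The bucket name and region are the ones from the question.

```shell
#!/usr/bin/env bash
# Sketch: compare the region S3 reports for a bucket with the region you expect.
# All function names here are illustrative, not part of the AWS CLI.

# Map the raw LocationConstraint value to a region name.
# S3 returns an empty constraint for buckets in us-east-1, which the CLI's
# text output prints as "None".
normalize_region() {
  case "$1" in
    ""|None|null) echo "us-east-1" ;;
    *) echo "$1" ;;
  esac
}

# Ask S3 where it currently believes the bucket lives.
bucket_region() {
  local raw
  raw=$(aws s3api get-bucket-location --bucket "$1" \
        --query 'LocationConstraint' --output text) || return 1
  normalize_region "$raw"
}

# Print the result; return 0 on a match, 1 on a mismatch, 2 if the lookup failed.
check_bucket_region() {
  local bucket=$1 expected=$2 actual
  actual=$(bucket_region "$bucket") || return 2
  if [ "$actual" = "$expected" ]; then
    echo "OK: $bucket is in $expected"
  else
    echo "MISMATCH: $bucket reported in $actual, expected $expected"
    return 1
  fi
}
```

Usage: check_bucket_region my-msk-backup eu-north-1. If this already reports eu-north-1 while the connector still fails with the us-east-1 message, the stale bucket record is probably not the cause, and it is worth checking which region the connector actually signs its requests with.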