We have deployed a DMS replication task to replicate our entire Postgres database to Redshift. The tables are getting created with the correct schemas, but the data isn't coming through to Redshift and getting held up in the S3 bucket DMS uses as an intermediary step. This is all deployed via Terraform.
We've configured the IAM roles as described in the replication instance Terraform docs with all three of dms-access-for-endpoint
, dms-cloudwatch-logs-role
, and dms-vpc-role
IAM roles created. The IAM roles are deployed via a different stack to where DMS is deployed from as the roles are used by another, successfully deployed, DMS instance running a different task.
data "aws_iam_policy_document" "dms_assume_role_document" {
statement {
actions = ["sts:AssumeRole"]
principals {
identifiers = [
"s3.amazonaws.com",
"iam.amazonaws.com",
"redshift.amazonaws.com",
"dms.amazonaws.com",
"redshift-serverless.amazonaws.com"
]
type = "Service"
}
}
}
# Database Migration Service requires the below IAM Roles to be created before
# replication instances can be created. See the DMS Documentation for
# additional information: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Security.html#CHAP_Security.APIRole
# * dms-vpc-role
# * dms-cloudwatch-logs-role
# * dms-access-for-endpoint
resource "aws_iam_role" "dms_access_for_endpoint" {
name = "dms-access-for-endpoint"
assume_role_policy = data.aws_iam_policy_document.dms_assume_role_document.json
managed_policy_arns = ["arn:aws:iam::aws:policy/service-role/AmazonDMSRedshiftS3Role"]
force_detach_policies = true
}
resource "aws_iam_role" "dms_cloudwatch_logs_role" {
name = "dms-cloudwatch-logs-role"
description = "Allow DMS to manage CloudWatch logs."
assume_role_policy = data.aws_iam_policy_document.dms_assume_role_document.json
managed_policy_arns = ["arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole"]
force_detach_policies = true
}
resource "aws_iam_role" "dms_vpc_role" {
name = "dms-vpc-role"
description = "DMS IAM role for VPC permissions"
assume_role_policy = data.aws_iam_policy_document.dms_assume_role_document.json
managed_policy_arns = ["arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole"]
force_detach_policies = true
}
However, on runtime, we're seeing the following logs in CloudWatch:
2022-09-01T16:51:38 [SOURCE_UNLOAD ]E: Not retriable error: <AccessDenied> Access Denied [1001705] (anw_retry_strategy.cpp:118)
2022-09-01T16:51:38 [SOURCE_UNLOAD ]E: Failed to list bucket 'dms-sandbox-redshift-intermediate-storage': error code <AccessDenied>: Access Denied [1001713] (s3_dir_actions.cpp:105)
2022-09-01T16:51:38 [SOURCE_UNLOAD ]E: Failed to list bucket 'dms-sandbox-redshift-intermediate-storage' [1001713] (s3_dir_actions.cpp:209)
We also enabled S3 server access logs on the bucket itself to see whether this would give us more information. This is what we're seeing (anonymised):
<id> dms-sandbox-redshift-intermediate-storage [01/Sep/2022:15:43:32 0000] 10.128.69.80 arn:aws:sts::<account>:assumed-role/dms-access-for-endpoint/dms-session-for-replication-engine <code> REST.GET.BUCKET - "GET /dms-sandbox-redshift-intermediate-storage?delimiter=/&max-keys=1000 HTTP/1.1" 403 AccessDenied 243 - 30 - "-" "aws-sdk-cpp/1.8.80/S3/Linux/4.14.276-211.499.amzn2.x86_64 x86_64 GCC/4.9.3" - <code> SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3.eu-west-2.amazonaws.com TLSv1.2 -
The above suggests that a service dms-session-for-replication
is the service in question that is receiving the AccessDenied responses, but we're unable to pinpoint what this is and how we can fix it.
We attempted to add a bucket policy to the S3 bucket itself but this did not work (this also includes the S3 server access logs bucket):
resource "aws_s3_bucket" "dms_redshift_intermediate" {
# Prefixed with `dms-` as that's what the AmazonDMSRedshiftS3Role policy filters on
bucket = "dms-sandbox-redshift-intermediate-storage"
}
resource "aws_s3_bucket_logging" "log_bucket" {
bucket = aws_s3_bucket.dms_redshift_intermediate.id
target_bucket = aws_s3_bucket.log_bucket.id
target_prefix = "log/"
}
resource "aws_s3_bucket" "log_bucket" {
bucket = "${aws_s3_bucket.dms_redshift_intermediate.id}-logs"
}
resource "aws_s3_bucket_acl" "log_bucket" {
bucket = aws_s3_bucket.log_bucket.id
acl = "log-delivery-write"
}
resource "aws_s3_bucket_policy" "dms_redshift_intermediate_policy" {
bucket = aws_s3_bucket.dms_redshift_intermediate.id
policy = data.aws_iam_policy_document.dms_redshift_intermediate_policy_document.json
}
data "aws_iam_policy_document" "dms_redshift_intermediate_policy_document" {
statement {
actions = [
"s3:*"
]
principals {
identifiers = [
"dms.amazonaws.com",
"redshift.amazonaws.com"
]
type = "Service"
}
resources = [
aws_s3_bucket.dms_redshift_intermediate.arn,
"${aws_s3_bucket.dms_redshift_intermediate.arn}/*"
]
}
}
How do we fix the <AccessDenied>
issues that we're seeing on CloudWatch and enable data loading to Redshift? DMS is able to PUT
items in the S3 bucket as we're seeing encrypted CSVs appearing in there (the server access logs also confirm this), but DMS is unable to then GET
the files back out of it for Redshift. The AccessDenied responses also suggest that it's an IAM role issue not a security group issue but our IAM roles are configured as per the docs so we're confused as to what could be causing this issue.
CodePudding user response:
You are right this is an IAM role issue, make sure the role in questions has the following statements added to the policy document,
{
"Effect": "Allow",
"Action": [
"s3:*"
],
"Resource":"arn:aws:s3:::<yourbucketnamehere>/*"
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource":"arn:aws:s3:::<yourbucketnamehere>"
},
{
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"s3:GetBucketLocation"
],
"Resource": "arn:aws:s3:::*"
}
CodePudding user response:
What we thought was an IAM issue, was actually a security group issue. The COPY
command for Redshift was struggling to access S3. By adding a 443 egress rule for HTTPS to the Redshift security group, we were able to pull data through again
resource "aws_security_group_rule" "https_443_egress" {
type = "egress"
description = "Allow HTTP egress from DMS SG"
protocol = "tcp"
to_port = 443
from_port = 443
security_group_id = aws_security_group.redshift.id
cidr_blocks = ["0.0.0.0/0"]
}
So if you're experiencing the same issue as the question, check whether Redshift has access to S3 via HTTPS.