I'm doing pre-processing tasks using EC2.
I execute shell commands through the UserData variable. The last line of my UserData is sudo shutdown now -h, so the instance gets terminated automatically once the pre-processing task is completed.
This is what my code looks like:
import boto3

userdata = '''#!/bin/bash
pip3 install boto3 pandas scikit-learn
aws s3 cp s3://.../main.py .
python3 main.py
sudo shutdown now -h
'''

def launch_ec2():
    ec2 = boto3.resource('ec2',
                         aws_access_key_id="",
                         aws_secret_access_key="",
                         region_name='us-east-1')
    instances = ec2.create_instances(
        ImageId='ami-0c02fb55956c7d316',
        MinCount=1,
        MaxCount=1,
        KeyName='',
        InstanceInitiatedShutdownBehavior='terminate',
        IamInstanceProfile={'Name': 'S3fullaccess'},
        InstanceType='m6i.4xlarge',
        UserData=userdata,
        InstanceMarketOptions={
            'MarketType': 'spot',
            'SpotOptions': {
                'SpotInstanceType': 'one-time',
            }
        }
    )
    print(instances)

launch_ec2()
The problem is, sometimes when there is an error in my Python script, the script dies and the instance gets terminated.
Is there a way I can collect error/info logs and send them to CloudWatch before the instance gets terminated? That way, I would know what went wrong.
CodePudding user response:
You can achieve the desired behavior by leveraging bash functionality.
You can create a log file for the entire execution of the UserData, and use trap to make sure the log file is copied over to S3 before the instance terminates if an error occurs.
Here's how it could look:
#!/bin/bash -xe
exec &>> /tmp/userdata_execution.log

upload_log() {
    aws s3 cp /tmp/userdata_execution.log s3://... # use a bucket of your choosing here
}
trap 'upload_log' ERR

pip3 install boto3 pandas scikit-learn
aws s3 cp s3://.../main.py .
python3 main.py
sudo shutdown now -h
A log file (/tmp/userdata_execution.log) containing both stdout and stderr will be generated for the UserData; if there is an error during the execution of the UserData, the log file will be uploaded to the S3 bucket before the instance shuts down.
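If you want to verify the trap mechanism locally before baking it into your UserData, here is a minimal, self-contained sketch. Note the substitutions it makes for illustration: a local cp stands in for aws s3 cp (the real bucket path is site-specific), the /tmp file names are arbitrary, and a deliberate false simulates a failing step.

```shell
# Demonstration of the ERR-trap pattern from the script above.
# `cp` stands in for `aws s3 cp`; on a real instance upload_log
# would copy the log to your bucket instead.
rm -f /tmp/userdata_execution.log /tmp/userdata_saved.log

bash -e <<'SCRIPT' || echo "script failed, but the log was saved first"
exec &>> /tmp/userdata_execution.log

upload_log() {
    cp /tmp/userdata_execution.log /tmp/userdata_saved.log
}
trap 'upload_log' ERR

echo "step 1 ok"
false   # simulated failure: the ERR trap fires before the script dies
SCRIPT
```

Because of exec &>>, everything the inner script prints lands in the log file, and the ERR trap runs the upload function before bash -e aborts. If you also want the log saved on successful runs, trap on EXIT instead of ERR.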
If you wanted to, you could of course also stream the log file to CloudWatch; to do so, however, you would have to install the CloudWatch agent on the instance and configure it accordingly. I believe that for your use case, uploading the log file to S3 is the best solution.
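For completeness, if you did go the CloudWatch agent route, a minimal agent configuration to ship this log file might look like the sketch below. The log group and stream names are placeholders of my choosing ({instance_id} is a substitution the agent supports), and the instance profile would also need CloudWatch Logs write permissions in addition to S3 access.

```json
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/tmp/userdata_execution.log",
            "log_group_name": "userdata-logs",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
```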