Following the best practice "Take advantage of execution environment reuse to improve the performance of your function," I am investigating whether caching the boto3 client has any negative effect when using Lambda Provisioned Concurrency. The boto3 client is cached through the @lru_cache decorator and is lazily initialized. The concern is that the underlying credentials of the boto3 client are never refreshed, because Provisioned Concurrency keeps the execution environment alive for an unknown amount of time. That lifetime might be longer than the duration of the temporary credentials the Lambda environment injected.
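Roughly, the pattern looks like this (a minimal sketch; the function and client names are just placeholders, not my actual code):

```python
import boto3
from functools import lru_cache

@lru_cache(maxsize=None)
def get_s3_client():
    # Lazily create and cache the client so every invocation in this
    # execution environment reuses the same boto3 client (and credentials).
    return boto3.client("s3")

def handler(event, context):
    s3 = get_s3_client()
    return s3.list_buckets()["Owner"]["ID"]
```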
I couldn't find any documentation explaining how this case is handled. Does anyone know how the Lambda environment handles credential refreshing in this situation?
CodePudding user response:
If you're using hardcoded credentials:
You have a bigger security issue than "re-used" credentials and should remove them immediately.
From the documentation:
Do NOT put literal access keys in your application files. If you do, you create a risk of accidentally exposing your credentials if, for example, you upload the project to a public repository.
Do NOT include files that contain credentials in your project area.
Replace them with an execution role.
If you're using an execution role:
You're not providing any credentials manually for any AWS SDK calls. The credentials for the SDK come automatically from the Lambda function's execution role.
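Concretely, a sketch of what that looks like in a handler (the client type and names are just examples):

```python
import boto3

# No aws_access_key_id / aws_secret_access_key arguments: with an execution
# role, boto3 resolves temporary credentials from the Lambda environment.
dynamodb = boto3.client("dynamodb")

def handler(event, context):
    return dynamodb.list_tables()["TableNames"]
```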
Even if Boto3 role credentials are shared across invocations under the hood for provisioned concurrency (nobody is sure), what would be the issue?
Let Amazon deal with role credentials - it's not your responsibility to manage that at all.
I would worry more about security flaws in the application code than about how Amazon automatically authenticates SDK requests with the execution role's credentials.
CodePudding user response:
They aren't refreshed.
The Boto3 documentation doesn't do a very good job of describing the credential chain, but the CLI documentation lists the various credential sources (and since the CLI is written in Python, it is effectively authoritative for Boto3 as well).
Unlike EC2 and ECS, which retrieve role-based credentials from instance metadata, Lambda is provided with credentials in environment variables. The Lambda runtime sets those environment variables when it starts, and every invocation of that Lambda runtime uses the same values.
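You can see this from inside a function; a sketch that just reports which of the credential variables Lambda populated:

```python
import os

def handler(event, context):
    # Lambda injects the execution-role credentials as environment variables
    # when the execution environment starts; they are not rotated afterwards.
    for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"):
        print(name, "is set:", name in os.environ)
    return "ok"
```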
Concurrent Lambdas receive separate sets of credentials, just like you would if you made concurrent explicit calls to STS AssumeRole.
Provisioned concurrency is a little trickier. You might think that the same Lambda runtime lives "forever," but in fact it does not: if you repeatedly invoke a Lambda with provisioned concurrency, you'll see that at some point it creates a new CloudWatch log stream. This is an indication that Lambda has started a new runtime. Lambda will finish initializing the new runtime before it stops sending requests to the old runtime, so you don't get a cold start delay.
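If you want to observe this yourself, logging the log stream name and a module-level timestamp on each invocation is enough; a small sketch (the variable names are mine):

```python
import time

# Set once per execution environment, during init.
_ENV_STARTED_AT = time.time()

def handler(event, context):
    # When Lambda recycles a provisioned-concurrency environment, the log
    # stream name changes and _ENV_STARTED_AT resets to a newer value.
    print("log stream:", context.log_stream_name)
    print("environment age (s):", round(time.time() - _ENV_STARTED_AT, 1))
    return "ok"
```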