I have the following CSV file listing EC2 instances:
instanceID,region
i-0020cad819e7393c0,DUB
i-00006ea9f2460375f,DUB
I want to add a column based on the tags of those instances, for example each instance's Name tag.
import boto3
import pandas as pd

ec2 = boto3.resource('ec2')

def get_instance_tags(instance):
    # Collect the Name and Description tags of one instance into a dict
    tags = {}
    instance_obj = ec2.Instance(instance)
    # .tags can be None when the instance has no tags, so fall back to an empty list
    for tag in instance_obj.tags or []:
        if tag['Key'] == 'Name':
            tags['Instance Name'] = tag['Value']
        if tag['Key'] == 'Description':
            tags['Description'] = tag['Value']
    return tags
The above function returns a dictionary with tags for a given instance.
I want to loop over each instance ID in my .csv file and append the corresponding tag value in a new column. For example:
instanceID,region,Name
i-0020cad819e1234,DUB,Name1
i-00006ea9f241234,DUB,Name2
I thought something like this could work:
for i in df['instanceID']:
    tags = get_instance_tags(i)
    name = tags.get('Instance Name')
    df['Name'] = name
The above just copies the same value to all cells.
I'm not sure which approach I should go with here. I find it difficult to Google the exact term to find the solution.
CodePudding user response:
For your code to work, you need to replace the line:
    df['Name'] = name
with
    df.loc[df['instanceID'] == i, 'Name'] = name
Otherwise, assigning a single value to df['Name'] overwrites the entire column on every iteration, so only the last instance's name survives.
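To see the difference without calling AWS, here is a minimal sketch that compares the two assignments side by side. The short instance IDs and the `names` dictionary are made up for illustration and stand in for the real tag lookup.

```python
import pandas as pd

df = pd.DataFrame({"instanceID": ["i-aaa", "i-bbb"], "region": ["DUB", "DUB"]})
# Hypothetical stand-in for the values get_instance_tags would return
names = {"i-aaa": "Name1", "i-bbb": "Name2"}

# Assigning a scalar to a column broadcasts it to every row,
# so each iteration overwrites the whole column and the last value wins
broadcast = df.copy()
for i in broadcast["instanceID"]:
    broadcast["Name"] = names[i]

# A boolean mask with .loc restricts the assignment to the matching rows
masked = df.copy()
for i in masked["instanceID"]:
    masked.loc[masked["instanceID"] == i, "Name"] = names[i]

print(broadcast["Name"].tolist())  # ['Name2', 'Name2']
print(masked["Name"].tolist())     # ['Name1', 'Name2']
```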
Ref: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html
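As an alternative to the loop, the whole column can probably be built in one step with `Series.map`, which calls the lookup once per row. This is only a sketch: `add_tag_column` is a hypothetical helper, and `fake_get_instance_tags` with its `FAKE_TAGS` data is a stand-in for the question's `get_instance_tags` so that the example runs without AWS credentials.

```python
import io
import pandas as pd

def add_tag_column(df, lookup, tag_key, column):
    """Return a copy of df with `column` set to lookup(instance_id).get(tag_key)."""
    out = df.copy()
    # map() applies the lookup to each instance ID; a missing tag becomes None
    out[column] = out["instanceID"].map(lambda i: lookup(i).get(tag_key))
    return out

# Made-up tag data standing in for the real EC2 calls
FAKE_TAGS = {
    "i-0020cad819e7393c0": {"Instance Name": "Name1"},
    "i-00006ea9f2460375f": {"Instance Name": "Name2"},
}

def fake_get_instance_tags(instance_id):
    return FAKE_TAGS.get(instance_id, {})

csv_text = "instanceID,region\ni-0020cad819e7393c0,DUB\ni-00006ea9f2460375f,DUB\n"
df = pd.read_csv(io.StringIO(csv_text))

result = add_tag_column(df, fake_get_instance_tags, "Instance Name", "Name")
print(result.to_csv(index=False))
```

With the real function you would pass `get_instance_tags` instead of the stand-in, and `result.to_csv("some_file.csv", index=False)` would write the new column back to disk (the file name here is a placeholder).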