Pandas loop over each line of a column and append the corresponding value in a new column

Time:12-28

I have the following CSV listing EC2 instances:

instanceID,region
i-0020cad819e7393c0,DUB
i-00006ea9f2460375f,DUB

I want to create a column based on the tags of those instances, for example, the name of those instances.

import boto3
import pandas as pd

ec2 = boto3.resource('ec2')

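def get_instance_tags(instance):
    """Return the Name and Description tags of an EC2 instance as a dict."""
    tags = {}
    instance_obj = ec2.Instance(instance)
    # instance_obj.tags is None when the instance has no tags at all
    for tag in instance_obj.tags or []:
        if tag['Key'] == 'Name':
            tags['Instance Name'] = tag['Value']
        if tag['Key'] == 'Description':
            tags['Description'] = tag['Value']
    return tags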

The above function returns a dictionary with tags for a given instance.
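To make the shape of that dictionary concrete without calling AWS, here is a minimal sketch of the same tag-picking logic applied to a plain list. The helper name extract_tags and the sample tag values are made up for illustration; the list-of-Key/Value-dicts layout matches what the loop above iterates over.

```python
def extract_tags(tag_list):
    """Pick the Name and Description tags out of a list of Key/Value dicts."""
    # Map each tag key of interest to the dictionary key used in the question.
    wanted = {'Name': 'Instance Name', 'Description': 'Description'}
    tags = {}
    # Treat a missing tag list (None) the same as an empty one.
    for tag in tag_list or []:
        if tag['Key'] in wanted:
            tags[wanted[tag['Key']]] = tag['Value']
    return tags

# Hypothetical sample data, not taken from a real instance.
sample = [
    {'Key': 'Name', 'Value': 'Name1'},
    {'Key': 'Owner', 'Value': 'ops'},
]
print(extract_tags(sample))
print(extract_tags(None))
```

Tags other than Name and Description are ignored, and an instance without the Name tag simply yields a dictionary with no 'Instance Name' key, which is why the later code uses tags.get('Instance Name') rather than tags['Instance Name'].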

I want to loop over each Instance ID in my .csv file, and append the corresponding value of a tag in a new column. For example:

instanceID,region,Name
i-0020cad819e1234,DUB,Name1
i-00006ea9f241234,DUB,Name2

I thought something like this could work:

for i in df['instanceID']:
    tags = get_instance_tags(i)
    name = tags.get('Instance Name')
    df['Name'] = name

The above just copies the same value to all cells.

I'm not sure which approach I should go with here. I find it difficult to Google the exact term to find the solution.

CodePudding user response:

For your code to work, replace the line:

df['Name'] = name

with

    df.loc[df['instanceID'] == i, 'Name'] = name

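Otherwise every iteration assigns the single value to the whole 'Name' column, so all rows end up with the name of the last instance. With df.loc and a boolean mask, only the rows whose instanceID equals i are updated.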
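Here is a small runnable sketch of the corrected loop. To keep it independent of AWS, get_instance_tags is swapped for a hypothetical stand-in (fake_get_instance_tags) backed by a hard-coded dictionary, and the names Name1 and Name2 are invented sample values.

```python
import pandas as pd

# Hypothetical stand-in for get_instance_tags(), so no AWS access is needed.
FAKE_TAGS = {
    'i-0020cad819e7393c0': {'Instance Name': 'Name1'},
    'i-00006ea9f2460375f': {'Instance Name': 'Name2'},
}

def fake_get_instance_tags(instance_id):
    return FAKE_TAGS.get(instance_id, {})

df = pd.DataFrame({
    'instanceID': ['i-0020cad819e7393c0', 'i-00006ea9f2460375f'],
    'region': ['DUB', 'DUB'],
})

for i in df['instanceID']:
    name = fake_get_instance_tags(i).get('Instance Name')
    # The boolean mask restricts the assignment to the rows for this ID.
    df.loc[df['instanceID'] == i, 'Name'] = name

print(df)
```

The first assignment creates the 'Name' column, and each later iteration fills in only its own row.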

Ref: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html
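As an alternative to the explicit loop, you could probably build the column in one step with Series.map, which calls a function once per value and returns a new Series aligned with the original index. This is only a sketch: fake_get_instance_tags is again a hypothetical stand-in for get_instance_tags, and the third instance ID is made up to show what happens when no Name tag is found.

```python
import pandas as pd

# Hypothetical stand-in for get_instance_tags(), so no AWS access is needed.
FAKE_TAGS = {
    'i-0020cad819e7393c0': {'Instance Name': 'Name1'},
    'i-00006ea9f2460375f': {'Instance Name': 'Name2'},
}

def fake_get_instance_tags(instance_id):
    return FAKE_TAGS.get(instance_id, {})

df = pd.DataFrame({
    'instanceID': [
        'i-0020cad819e7393c0',
        'i-00006ea9f2460375f',
        'i-ffffffffffffffff0',  # invented ID with no tags in FAKE_TAGS
    ],
    'region': ['DUB', 'DUB', 'DUB'],
})

# One lookup per row; rows without a Name tag get a missing value.
df['Name'] = df['instanceID'].map(
    lambda i: fake_get_instance_tags(i).get('Instance Name')
)

print(df)
```

This avoids rescanning the whole instanceID column on every iteration, which the df.loc mask does. With the real function each row still costs one AWS call, so the gain is mostly in readability.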
