I have this dataframe with 4 columns. I want to extract the resourceName values (i.e. the IDs) into one separate column. I tried various methods and loops but was unable to separate them.
Dataset:
Username | Event name | Resources |
---|---|---|
XYZ-DEV_ENV_POST_function | StopInstances | [{"resourceType":"AWS::EC2::Instance","resourceName":"i-05fbb7a"}] |
XYZ-DEV_ENV_POST_function | StartInstances | [{"resourceType":"AWS::EC2::Instance","resourceName":"i-08bd2475"},{"resourceType":"AWS::EC2::Instance","resourceName":"i-0fd69dc1"},{"resourceType":"AWS::EC2::Instance","resourceName":"i-0174dd38aea"}] |
I want one more column, IDS, which will hold the IDs from the Resources column and will look like this:
Username | Event name | Resources | IDS |
---|---|---|---|
XYZ-DEV_ENV_POST_function | StopInstances | [{"resourceType":"AWS::EC2::Instance","resourceName":"i-05fbb7a"}] | i-05fbb7a |
XYZ-DEV_ENV_POST_function | StartInstances | [{"resourceType":"AWS::EC2::Instance","resourceName":"i-08bd2475"},{"resourceType":"AWS::EC2::Instance","resourceName":"i-0fd69dc1"},{"resourceType":"AWS::EC2::Instance","resourceName":"i-0174dd38aea"}] | i-08bd2475 , i-0fd69dc1 , i-0174 |
Here is the output of data.head(2).to_dict():
{'Date': {0: '28-02-2022', 1: '28-02-2022'}, 'Event name': {0: 'StopInstances', 1: 'StartInstances'}, 'Resources': { 0: '[{"resourceType":"AWS::EC2::Instance","resourceName":"i-05fbb7a"}]', 1: '[{"resourceType":"AWS::EC2::Instance","resourceName":"i-08bd2475"},{"resourceType":"AWS::EC2::Instance","resourceName":"i-0fd69dc1"},{"resourceType":"AWS::EC2::Instance","resourceName":"i-0174dd38aea"}]'}, 'User name': {0: 'XYZ-DEV_ENV_POST_function', 1: 'XYZ-DEV_ENV_POST_function'}}
Thanks and Regards
CodePudding user response:
import json  # the Resources strings are JSON, so json.loads parses them without eval()
df['ID'] = df['Resources'].apply(lambda x: ','.join(i['resourceName'] for i in json.loads(x)))
Date ... ID
0 28-02-2022 ... i-05fbb7a
1 28-02-2022 ... i-08bd2475,i-0fd69dc1,i-0174dd38aea
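For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two sample rows from the `to_dict()` output in the question and adds the new column. It assumes every `Resources` value is a JSON string like the ones shown. The helper name `extract_ids`, the `IDS` column name, and the choice to return an empty string for non-string cells (e.g. NaN) are my own assumptions, not something from the question.

```python
import json

import pandas as pd

# Sample rows copied from the data.head(2).to_dict() output in the question.
data = pd.DataFrame({
    "Date": ["28-02-2022", "28-02-2022"],
    "Event name": ["StopInstances", "StartInstances"],
    "Resources": [
        '[{"resourceType":"AWS::EC2::Instance","resourceName":"i-05fbb7a"}]',
        '[{"resourceType":"AWS::EC2::Instance","resourceName":"i-08bd2475"},'
        '{"resourceType":"AWS::EC2::Instance","resourceName":"i-0fd69dc1"},'
        '{"resourceType":"AWS::EC2::Instance","resourceName":"i-0174dd38aea"}]',
    ],
    "User name": ["XYZ-DEV_ENV_POST_function", "XYZ-DEV_ENV_POST_function"],
})


def extract_ids(resources):
    """Hypothetical helper: join all resourceName values found in a JSON string."""
    # Assumption: missing cells (NaN, None) should give an empty string.
    if not isinstance(resources, str):
        return ""
    items = json.loads(resources)
    # Skip entries that have no resourceName key instead of raising KeyError.
    return ",".join(item["resourceName"] for item in items if "resourceName" in item)


data["IDS"] = data["Resources"].apply(extract_ids)
print(data[["Event name", "IDS"]])
```

Parsing with `json.loads` rather than `eval` means the strings are treated only as data, which is safer if the log content comes from outside your control.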