Pandas – Extracting a phrase in a dict column

Time:03-12

One column of my DataFrame holds lists of dicts, and I want to extract a specific piece of text from each row. Is there a way to pull that value out of every row in the column and put it in a new column?

From this:

|studios|
|-------|
|[{'mal_id': 14, 'name': 'Sunrise'}]|
|[{'mal_id': 34, 'name': 'Hal Film Maker'}]|
|[{'mal_id': 18, 'name': 'Toei Animation'}]|
|[]|
|[{'mal_id': 455, 'name': 'Palm Studio'}]|

To this:

|studios|
|-------|
|Sunrise|
|Hal Film Maker|
|Toei Animation|
|[]|
|Palm Studio|

CodePudding user response:

You can use the .str accessor to index into the lists and dicts stored in a column. Rows where the lookup finds nothing (such as an empty list) come back as NaN, so combine pipe and where to fall back to the original value in those rows:

df['studios'] = df['studios'].str[0].str['name'].pipe(lambda x: x.where(x.notna(), df['studios']))
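As a minimal sketch of the chain above, assuming the column already holds real Python lists of dicts: the sample frame is rebuilt from the question's table, and the names `names` and `result` are only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; each cell is a real list of dicts.
df = pd.DataFrame({
    "studios": [
        [{"mal_id": 14, "name": "Sunrise"}],
        [{"mal_id": 34, "name": "Hal Film Maker"}],
        [{"mal_id": 18, "name": "Toei Animation"}],
        [],
        [{"mal_id": 455, "name": "Palm Studio"}],
    ]
})

# .str[0] takes the first element of each list (NaN for an empty list);
# .str["name"] then looks up the "name" key in each dict.
names = df["studios"].str[0].str["name"]

# Where no name was found, keep the original cell (here, the empty list).
result = names.where(names.notna(), df["studios"])

print(result.tolist())
```

Splitting the chain into two steps does the same work as the one-liner with pipe; it just makes the intermediate NaN for the empty row easier to inspect.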

Note: you may need to convert the items in df['studios'] to actual Python objects first, in case they are just strings that look like lists of dicts (for example, after loading from a CSV). To do that, run this before the code above:

import ast
df['studios'] = df['studios'].apply(ast.literal_eval)
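To illustrate the conversion step, here is a small sketch assuming the cells arrive as strings. The `raw` frame and the `isinstance` guard are assumptions added for illustration; the guard simply skips cells that are already parsed, so the step can be rerun safely.

```python
import ast
import pandas as pd

# Hypothetical raw column, e.g. as loaded from a CSV: every cell is a string.
raw = pd.DataFrame({
    "studios": ["[{'mal_id': 14, 'name': 'Sunrise'}]", "[]"]
})

# Parse only string cells; leave anything that is already an object untouched.
parsed = raw["studios"].apply(
    lambda v: ast.literal_eval(v) if isinstance(v, str) else v
)

print(type(parsed.iloc[0]).__name__)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for data of unknown origin, though it will still raise on cells that are not valid literals.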