Home > Back-end >  Extraction of values from pandas column which is a list of dictionaries
Extraction of values from pandas column which is a list of dictionaries

Time:01-23

A pandas dataframe called top_chart_movies, which has a column, genres, that has a list of dictionaries as shown in the picture below enter image description here

The column values has varying number of dictionary items within a list.

How to extract the values as a list and include it into another column genres1, where

top_chart_movies['genres1'].head(2)

1881  ["Drama","Crime"]
3337 ["Drama","Crime"]

I tried this following code, but it didn't work.

top_chart_movies['genres1'] = [value for key, value in top_chart_movies['genres']]

CodePudding user response:

This should work:

top_chart_movies['genres1'] = [[genres_item['name'] for genres_item in genres_list] for genres_list in top_chart_movies["genres"]]

How this works: If we iterate over top_chart_movies["genres"], like this: for genres_list in top_chart_movies["genres"] then for each row, the genres_list would contain a list of dictionaries with keys "id" and "name". For example, in the first row, genres_list would be [{"id": 18, "name": "Drama"}, {"id": 80, "name": "Crime"}].

For each row, we iterate over genres_list, for example for genres_item in genres_list, each iteration in genres_item we get a dictionary. For example, {"id": 18, "name": "Drama"}. Then we take only the "name" part: genres_item["name"].

So, for each row, to get list of "name" elements of the genres, we do [genres_item['name'] for genres_item in genres_list] and we do this in every row like this: [[genres_item['name'] for genres_item in genres_list] for genres_list in top_chart_movies["genres"]]

CodePudding user response:

def get_genre(lst):
    return [item["name"] for item in lst]

top_chart_movies['genres1'] = top_chart_movies['genres'].map(get_genre)

You don't need all the values associated to every key, but only the ones corresponding to each key ["name"].

Once you know how to extract them from one list, you can map that function to your column of lists.

  • Related