I have a pandas DataFrame called top_chart_movies with a column, genres, that holds a list of dictionaries in each row, as shown in the picture below.
The number of dictionaries in each list varies from row to row.
How can I extract the genre names as a list and store them in another column, genres1, so that the output looks like this?
top_chart_movies['genres1'].head(2)
1881 ["Drama","Crime"]
3337 ["Drama","Crime"]
I tried the following code, but it didn't work.
top_chart_movies['genres1'] = [value for key, value in top_chart_movies['genres']]
CodePudding user response:
This should work:
top_chart_movies['genres1'] = [[genres_item['name'] for genres_item in genres_list] for genres_list in top_chart_movies["genres"]]
How this works: if we iterate over top_chart_movies["genres"], like this: for genres_list in top_chart_movies["genres"], then for each row genres_list contains a list of dictionaries with the keys "id" and "name". For example, in the first row, genres_list would be [{"id": 18, "name": "Drama"}, {"id": 80, "name": "Crime"}].
For each row, we then iterate over genres_list, for example with for genres_item in genres_list. On each iteration, genres_item is a single dictionary, for example {"id": 18, "name": "Drama"}. From it we take only the "name" part: genres_item["name"].
So, for each row, to get the list of "name" values of the genres, we do [genres_item['name'] for genres_item in genres_list], and we do this for every row like this: [[genres_item['name'] for genres_item in genres_list] for genres_list in top_chart_movies["genres"]]
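To check the nested comprehension end to end, here is a minimal sketch on a small made-up frame. The sample rows and index values are assumptions for illustration only; they just mimic the shape of the genres column described in the question.

```python
import pandas as pd

# Hypothetical sample data shaped like the genres column in the question.
top_chart_movies = pd.DataFrame(
    {
        "genres": [
            [{"id": 18, "name": "Drama"}, {"id": 80, "name": "Crime"}],
            [{"id": 35, "name": "Comedy"}],
            [],  # a row with no genres at all
        ]
    },
    index=[1881, 3337, 42],
)

# Outer loop: one genres_list per row. Inner loop: one dictionary per genre.
top_chart_movies["genres1"] = [
    [genres_item["name"] for genres_item in genres_list]
    for genres_list in top_chart_movies["genres"]
]

print(top_chart_movies["genres1"].tolist())
# [['Drama', 'Crime'], ['Comedy'], []]
```

Lists of different lengths, including an empty one, are handled the same way, because the inner comprehension simply runs once per dictionary.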
CodePudding user response:
def get_genre(lst):
    # Collect the "name" value from each dictionary in the list
    return [item["name"] for item in lst]

top_chart_movies['genres1'] = top_chart_movies['genres'].map(get_genre)
You don't need every value in each dictionary, only the one stored under the "name" key.
Once you know how to extract those from one list, you can map that function over your column of lists.
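If some rows might hold a missing value instead of a list, the mapped function can guard against that. The following is only a sketch under that assumption: the sample data and the name get_genre_safe are hypothetical and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data; the None row stands in for a missing value.
top_chart_movies = pd.DataFrame(
    {
        "genres": [
            [{"id": 18, "name": "Drama"}, {"id": 80, "name": "Crime"}],
            None,
        ]
    }
)

def get_genre_safe(lst):
    # Illustrative helper: return an empty list for anything that is
    # not a list (for example None or NaN) instead of raising an error.
    if not isinstance(lst, list):
        return []
    return [item["name"] for item in lst]

# Series.map calls the function once per row of the column.
top_chart_movies["genres1"] = top_chart_movies["genres"].map(get_genre_safe)

print(top_chart_movies["genres1"].tolist())
# [['Drama', 'Crime'], []]
```

Whether an empty list is the right fallback depends on how the column is used later; the point is only that the per-list function is the single place where such a rule needs to live.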