I'm trying to analyze and visualize a movie dataset with a violin plot. The relevant features are each movie's rating and its genres. My problem is that most movies have more than one genre (I haven't encoded them yet). Take the data below as an example:
import pandas as pd
genres = [['action', 'drama', 'comedy'], ['comedy', 'drama'], ['action', 'sci-fi', 'comedy']]
rating = [6, 8, 3]
data = {'Rating': rating, 'Genres': genres}
df = pd.DataFrame(data=data)
That would give:
|   | Rating | Genres                         |
|---|--------|--------------------------------|
| 0 | 6      | ['action', 'drama', 'comedy']  |
| 1 | 8      | ['comedy', 'drama']            |
| 2 | 3      | ['action', 'sci-fi', 'comedy'] |
If there were only one genre per film, I would just do:
import seaborn as sns
sns.violinplot(x='Genres', y='Rating', data=df)
How can I do the same thing when a feature can hold more than one value per row? Any ideas are more than welcome!
CodePudding user response:
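One possible approach, sketched under the assumption that a multi-genre movie should contribute its rating to every genre it belongs to: reshape the frame to long format with `DataFrame.explode`, so each row holds a single genre, and then plot as in the single-genre case. The names `long_df` and `plot_rating_by_genre` are made up for this illustration and are not part of pandas or seaborn.

```python
import pandas as pd

genres = [['action', 'drama', 'comedy'], ['comedy', 'drama'], ['action', 'sci-fi', 'comedy']]
rating = [6, 8, 3]
df = pd.DataFrame({'Rating': rating, 'Genres': genres})

# explode() repeats each row once per list element, so a movie with
# three genres contributes its rating to three genre groups.
long_df = df.explode('Genres', ignore_index=True)


def plot_rating_by_genre(frame):
    """Draw one violin per genre from a long-format frame (hypothetical helper)."""
    # Imported lazily so the reshaping step above runs without seaborn installed.
    import seaborn as sns
    return sns.violinplot(x='Genres', y='Rating', data=frame)
```

Calling `plot_rating_by_genre(long_df)` should then draw one violin per genre. Keep in mind that each rating is counted once per genre, so the violins are not independent samples, and with only a handful of rows per genre (as in this toy data) the shapes will not be very informative.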