Why does my use of "isin" to filter my data frame's rows by column based on values in

I'm trying to build a function that takes the genres linked to a movieId, stored as a list, and returns other movies that share one or more of those genres. I can create the list and have confirmed that it is a list, but when I pass it to "isin" as a filter, I get an empty data frame.

First I split the genres column on the "|" delimiter:

inner_join_movies_ratings.genres = inner_join_movies_ratings.genres.str.split("|")

This gives me the following data frame, where the "genres" column has dtype object.

Data frame
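
For reference, here is a minimal sketch of what the split step produces. The two-row frame "df" and its contents are made up for illustration and are not the real data. The point it shows is that each cell of "genres" becomes a Python list, not a string.

```python
import pandas as pd

# Hypothetical two-row frame standing in for inner_join_movies_ratings.
df = pd.DataFrame({"genres": ["Adventure|Comedy", "Horror"]})

# Same split as above: each "|"-separated string becomes a list of genres.
df["genres"] = df["genres"].str.split("|")

first_cell = df["genres"].iloc[0]
print(type(first_cell).__name__, first_cell)  # list ['Adventure', 'Comedy']
```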

Next, I create a variable "genre_list" that contains the genres associated with the movieId entered by the user.

def input_output(x):
    
    y = inner_join_movies_ratings.loc[inner_join_movies_ratings.movieId == x]
    
    #get genres
    genre_list = y.genres[0]
    print(genre_list)
    
    a = inner_join_movies_ratings[inner_join_movies_ratings['genres'].isin(genre_list)]
    print(a)

Instead of "a" being a data frame of all the movies that contain at least one of the genres in genre_list, I get:

['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy'] # The contents of genre_list
Empty DataFrame
Columns: [movieId, title, genres, rating]
Index: []

CodePudding user response：

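isin() checks whether each cell, taken as a whole, equals one of the values in genre_list. After the split, every cell in "genres" is itself a list, so it never equals a single genre string and no row passes the filter. Instead, you can use set intersection to test whether two lists overlap, and use apply() to run that check on every row.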

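# .iloc[0] takes the first matching row by position; y.genres[0] looks up index label 0
# and raises a KeyError when the matching rows do not include that label
genre_set = set(y.genres.iloc[0])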

a = inner_join_movies_ratings[inner_join_movies_ratings['genres'].apply(lambda g: len(genre_set.intersection(g)) > 0)]
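
To try the approach above end to end, here is a self-contained sketch on toy data. The frame "movies", its titles and genres, and the function name "movies_sharing_genres" are all assumptions made for illustration; only the set-intersection filter comes from the answer.

```python
import pandas as pd

# Toy stand-in for inner_join_movies_ratings (made-up titles and genres).
movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["A", "B", "C", "D"],
    "genres": ["Adventure|Comedy", "Comedy|Romance", "Horror", "Adventure|Fantasy"],
})
movies["genres"] = movies["genres"].str.split("|")


def movies_sharing_genres(df, movie_id):
    """Return the rows of df whose genre list overlaps the genres of movie_id."""
    match = df.loc[df["movieId"] == movie_id, "genres"]
    if match.empty:
        # Unknown movieId: return an empty frame with the same columns.
        return df.iloc[0:0]
    # Take the first matching row by position, not by index label.
    genre_set = set(match.iloc[0])
    # Keep rows whose genre list shares at least one genre with genre_set.
    mask = df["genres"].apply(lambda g: len(genre_set.intersection(g)) > 0)
    return df[mask]


result = movies_sharing_genres(movies, 1)
print(result["movieId"].tolist())  # [1, 2, 4]
```

Note that the queried movie is part of the result, since its genres overlap with themselves. If only other movies are wanted, one option is to additionally filter on df["movieId"] != movie_id.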