Create an edge list that groups films by genre, i.e. join two films of the same genre

Time:01-09

I've just started using Python and I want to build an edge list that links the titles of movies that have a genre in common. I have this dictionary:

dictionary_title_withonegenere=
{28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
16: ['Puss in Boots: The Last Wish', 'Strange World']}

Here 28, 12 and 16 are the genre IDs. I want to create an edge list that groups movies by genre, i.e. every pair of movies in the same genre is joined by an edge:

source                         target
Avatar: The Way of Water       Violent Night
Avatar: The Way of Water       Puss in Boots: The Last Wish
Violent Night                  Puss in Boots: The Last Wish
Avatar: The Way of Water       The Chronicles of Narnia: The Lion, the Witch and the Wardrobe
Puss in Boots: The Last Wish   Strange World

This is my idea:

edges=[]
genres=[28,12,16]

    for i in range(0,len(genres)):
            for genres[i] in dictionary_title_withonegenere[genres[i]]:
                for genres[i] in dictionary_title_withonegenere[genres[i]][1:]:
                    edges.append({"sorce":dictionary_title_withonegenere[genres[i]][0],"target":dictionary_title_withonegenere[genres[i]][y]})

    print((edges))

My code doesn't work. How can I do this?

CodePudding user response:

You can check whether two movies have a genre in common by building an intermediate data structure: a mapping from each movie to its set of genres. With that mapping, you can iterate over every pair of movies and add an edge whenever the two share at least one genre.

from pprint import pprint
dictionary_title_withonegenere= {28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
16: ['Puss in Boots: The Last Wish', 'Strange World']}

movies_with_genre = {}
movies_set = set()
for genre, movies in dictionary_title_withonegenere.items():
    for movie in movies:
        movies_with_genre.setdefault(movie, set()).add(genre)
        movies_set.add(movie)
    
pprint(movies_with_genre)
movie_list = list(movies_set)
edges = []
for i in range(len(movie_list)):
    source_movie= movie_list[i]
    for j in range(i + 1, len(movie_list)):
        target_movie = movie_list[j]
        common_genre = False
        for source_genre in movies_with_genre[source_movie]:
            if source_genre in movies_with_genre[target_movie]:
                common_genre = True
                break
        if common_genre:
            edges.append({"sorce":source_movie, "target":target_movie})
pprint(edges)

OUTPUT

{'Avatar: The Way of Water': {28, 12},
 'Puss in Boots: The Last Wish': {16, 28},
 'Strange World': {16},
 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe': {12},
 'Violent Night': {28}}
[{'sorce': 'Strange World', 'target': 'Puss in Boots: The Last Wish'},
 {'sorce': 'Avatar: The Way of Water',
  'target': 'Puss in Boots: The Last Wish'},
 {'sorce': 'Avatar: The Way of Water', 'target': 'Violent Night'},
 {'sorce': 'Avatar: The Way of Water',
  'target': 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'},
 {'sorce': 'Puss in Boots: The Last Wish', 'target': 'Violent Night'}]
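
The same movie -> genres idea can probably be written more compactly with itertools.combinations and a set intersection. This is only a sketch under a few assumptions: the key is spelled "source" here instead of "sorce", and the titles are sorted so the edge order is the same on every run (iterating over a set gives no guaranteed order).

```python
from itertools import combinations

# Same sample data as in the question.
dictionary_title_withonegenere = {
    28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
    12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
    16: ['Puss in Boots: The Last Wish', 'Strange World'],
}

# Invert the mapping: movie -> set of genres.
movies_with_genre = {}
for genre, movies in dictionary_title_withonegenere.items():
    for movie in movies:
        movies_with_genre.setdefault(movie, set()).add(genre)

# combinations(..., 2) yields every unordered pair of titles exactly once.
# Titles are sorted so the result is reproducible between runs.
edges = [
    {"source": a, "target": b}
    for a, b in combinations(sorted(movies_with_genre), 2)
    if movies_with_genre[a] & movies_with_genre[b]  # non-empty intersection
]

for edge in edges:
    print(edge)
```

For the sample data this gives five edges, the same pairs as in the output above, only in alphabetical order.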

CodePudding user response:

Try this:

edges=[]
genres=[28,12,16,35,80,99,18,10751,14,36,27,10402,9648,10749,878,10770,53,10752,37]
dictionary_title_withonegenere = {28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
16: ['Puss in Boots: The Last Wish', 'Strange World']}

for i in range(0,len(genres)):
    if genres[i] in dictionary_title_withonegenere:
        genres_list = dictionary_title_withonegenere[genres[i]]
        genres_list_len = len(genres_list)
        if genres_list_len <= 1:
            continue
        for j in range(genres_list_len):
            for k in range(j + 1, genres_list_len):
                edges.append({"name_movies":genres_list[j],"onegres_movies":genres_list[k]})

for edge in edges:
    print(f'{edge["name_movies"]: <40}{edge["onegres_movies"]}')

output

Avatar: The Way of Water                Violent Night
Avatar: The Way of Water                Puss in Boots: The Last Wish
Violent Night                           Puss in Boots: The Last Wish
Avatar: The Way of Water                The Chronicles of Narnia: The Lion, the Witch and the Wardrobe
Puss in Boots: The Last Wish            Strange World
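
One thing to keep in mind with the per-genre loop above: if two movies share more than one genre, the same pair is appended once per shared genre. That does not happen with the sample data, but it may with the full data set. A possible way around it, as a rough sketch, is to remember the pairs already seen. The helper name genre_edges and the data in sample are made up for illustration; they are not from the question.

```python
from itertools import combinations


def genre_edges(titles_by_genre):
    """Return unique (source, target) pairs of titles that share a genre."""
    seen = set()   # unordered pairs already emitted
    edges = []
    for titles in titles_by_genre.values():
        # Every pair of titles inside one genre becomes a candidate edge.
        for source, target in combinations(titles, 2):
            key = frozenset((source, target))  # ignores the pair's order
            if key not in seen:
                seen.add(key)
                edges.append((source, target))
    return edges


# Hypothetical data: "A" and "B" appear together in two genres.
sample = {1: ["A", "B", "C"], 2: ["A", "B"]}
print(genre_edges(sample))
# [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

The pair ('A', 'B') shows up only once even though both genres contain it.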