Home > Software design >  Create a dataframe which shows the Relationship between 2 Dataframes
Create a dataframe which shows the Relationship between 2 Dataframes

Time:12-21

Hello this is my first Question and I tried everything.

I'm new to Dataframe and SQL and I have an excel table with multiple Movies. It has the Columns Movie Name, Year ,...,and Actors. In the Column Actors are multiple Actors listet, which are separated by , or by /. I created a Dataframe Movies which have a UUID, and all the information I should store for a Movie. I also filtered all Actors which are in all my Movies and created a DataFrame called Actors with also a Unique ID and Name and stored them individually.

DF_MOVIE:

Movie ID Actors
x Person1,Person2,Person5
y Person1

DF_ACTORS:

ActorID Actor
a Person1
b Person 5

What I want to create is a new Dataframe which shows the Relationship between them.

For Example:

DF_ACTORS_MOVIE:

MOVIEID ACTORID
x a
x b

The Problem is: there Actors in my Excel table (and Bookdf). And I want to take both IDS.

Does anybody have an Idea? would appreciate, Thanks..:

I tried iterating both the Dataframe books and tried to create a new Entry on the DF_ACTORS_MOVIE Dataframe with both UUIDS. The Problem is, that there are multiple Actors and they are divided by , and by /.

CodePudding user response:

You can first explode the Actors into a new column:

out = df_actors.merge(df_movie.assign(Actor=df_movie['Actors'].str.split(','))
                                            .explode('Actor').drop(columns='Actors')
                      )

Output:

  ActorID    Actor Movie ID
0       a  Person1        x
1       a  Person1        y
2       b  Person5        x
  • Related