Problems joining two Pandas Dataframes

Time:05-25

I'm trying to create a report of my Trello cards through the REST API. The report needs to show, side by side, the card data and the names of the members assigned to each card.

The problem is that the Trello JSON is very cumbersome, so I need to make several queries and then merge the resulting data frames.

I'm currently stuck trying to add the card member names to the main card data frame.

I'm sending you a summary of the problem:

I have created the main data frame (trello_dataframe), which holds card-level information from Trello. It includes the "ID Members" column (trello_dataframe['ID Members']), stored as lists, which I need to merge with another data frame.

More info about trello_dataframe: https://prnt.sc/boC6OL50Glwu

The second data frame (df_response_members) comes from the query at the board member level and has 3 columns: ID Members (df_response_members['ID Members']), FullName (df_response_members['Member (Full Name)']), and Username (df_response_members['Member (Username)']).

More info about "df_response_members": https://prnt.sc/x6tmzI04rohs

Now I want to merge these two data frames on df_response_members['ID Members'], so that the full name and username of the card members appear in the card data frame (the main one).

The problem occurs when I try to merge the two data frames, with the following code, and I get the error

TypeError: unhashable type: 'list'.

at

trello_dataframe = pd.merge(df_response_members,trello_dataframe, on="ID Members", how='outer')

Here is how I would like to see the main data frame: https://prnt.sc/7PSTmG2zahZO

Thank you in advance!

Answer:

You can't do that for two reasons: A) as the error says, lists aren't hashable, and DataFrame operations such as merge typically don't work on unhashable data types; and B) you are trying to merge a column of lists with a column of strings. The key columns on both sides must hold the same type of value for a merge to match anything.
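Both points can be checked directly on the data. The following is a minimal sketch with made-up sample values (the `cards` and `members` frames are hypothetical stand-ins for trello_dataframe and df_response_members); it only inspects the element types and shows that a list cannot be hashed.

```python
import pandas as pd

# Hypothetical stand-ins for the two real data frames.
cards = pd.DataFrame({"ID Members": [["m1", "m2"], ["m2"]]})
members = pd.DataFrame({"ID Members": ["m1", "m2"]})

# Reason B: the key columns hold different Python types.
card_types = set(cards["ID Members"].map(type))
member_types = set(members["ID Members"].map(type))
print(card_types, member_types)

# Reason A: a list value cannot be hashed.
try:
    hash(cards["ID Members"].iloc[0])
    list_is_hashable = True
except TypeError as exc:
    list_is_hashable = False
    print(exc)
```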

A solution could be to first use DataFrame.explode() (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.explode.html) on your first DataFrame, trello_dataframe, with the 'ID Members' column. This generates a separate row for each member ID in each list, so the column then holds plain strings. You can then perform your merge with this exploded DataFrame.
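The explode-then-merge step might look like the sketch below. The sample values and the "Card Name" column are assumptions made for illustration; only the data frame names and the three member columns come from the question. A left merge is used here so that every card row is kept, which is a design choice rather than a requirement.

```python
import pandas as pd

# Hypothetical sample data standing in for the real Trello responses.
trello_dataframe = pd.DataFrame({
    "Card Name": ["Card A", "Card B"],          # assumed column
    "ID Members": [["m1", "m2"], ["m2"]],
})
df_response_members = pd.DataFrame({
    "ID Members": ["m1", "m2"],
    "Member (Full Name)": ["Alice Example", "Bob Example"],
    "Member (Username)": ["alice", "bob"],
})

# One row per (card, member ID) pair; the key column now holds plain strings.
exploded = trello_dataframe.explode("ID Members")

# Left merge: keep every card row and attach the member details where they match.
merged = exploded.merge(df_response_members, on="ID Members", how="left")
print(merged)
```

Cards with an empty member list end up with a missing value in 'ID Members' after explode(), so with how="left" they stay in the result with empty member columns.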

To convert back to your desired format (one row per card), you can use GroupBy, as described in the question "How to implode (reverse of pandas explode) based on a column".
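A rough sketch of that "implode" step follows, assuming the merged frame has a column that identifies each card. "Card Name" is used here as a hypothetical key; in real data a unique card ID would be a safer choice.

```python
import pandas as pd

# Hypothetical result of the explode + merge step: one row per card member.
merged = pd.DataFrame({
    "Card Name": ["Card A", "Card A", "Card B"],   # assumed grouping key
    "ID Members": ["m1", "m2", "m2"],
    "Member (Full Name)": ["Alice Example", "Bob Example", "Bob Example"],
    "Member (Username)": ["alice", "bob", "bob"],
})

# Collapse back to one row per card, collecting the member values into lists.
imploded = (
    merged.groupby("Card Name", as_index=False, sort=False)
    .agg({
        "ID Members": list,
        "Member (Full Name)": list,
        "Member (Username)": list,
    })
)
print(imploded)
```

Any other card-level columns would need their own aggregation (for example "first"), since they repeat the same value on every exploded row.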