Home > Blockchain >  how to merge 2 dataframes based on list
how to merge 2 dataframes based on list

Time:11-20

Hi im new to python and I am trying to multiply 2 dataframes by each other based on a list of names to create a 3rd one. the names are in a list.

the first df values do not change with the dates. it looks like:

name    value   <----column names
adam     1
bobb     2
cory     3
dave     4

the second df:

date          cory   dave   bobb   adam   <column names
2005-12-31     1      2      3      4
2006-12-31     2      2      3      4
2007-12-31     3      3      4      4 
2008-12-31     4      4      3      3

I would like to multiply the 2 dfs together based on the names while still in order of the dates so it looks something like this:

date          cory   dave   bobb   adam   <column names
2005-12-31     3      8      6      4
2006-12-31     6      8      6      4
2007-12-31     9      12     8      4 
2008-12-31     12     16     6      3

CodePudding user response:

Assuming df1 and df2 as dataframe names, you can use index alignment to directly multiply the column names. For this, you first need to reshape both dataframes:

new_df = df2.set_index('date').mul(df1.set_index('name')['value'])

output:

            adam  bobb  cory  dave
date                              
2005-12-31     4     6     3     8
2006-12-31     4     6     6     8
2007-12-31     4     8     9    12
2008-12-31     3     6    12    16

or:

new_df = (df2.set_index('date')
             .mul(df1.set_index('name')['value'])
             .reset_index()
          )

output:

         date  adam  bobb  cory  dave
0  2005-12-31     4     6     3     8
1  2006-12-31     4     6     6     8
2  2007-12-31     4     8     9    12
3  2008-12-31     3     6    12    16

or (in place modification of df2):

s = df1.set_index('name')['value']
df2[s.index] *= s

CodePudding user response:

You can use pandas itterows to get each row (and index of) of the first dataframe. From there, you can say (with each row discovered):

df2[column_name] = df2[column_name] * df.loc[[index]][value] 
  • Related