How to combine two dataframes, one with values and another with booleans, into one dataframe in Python?

Time:07-12

For example, I have two dataframes like:

dataframe1 would be

            A     B     C     D     E
values1  0.25  0.33  0.12  0.22  0.08
values2  0.20  0.50  0.89  0.65  0.75

and dataframe2 would be

              A     B     C      D     E
boolean1   True False  True  False  True
boolean2  False False  True   True  True

and I want a single dataframe as the result:

      A  B     C     D     E
1  0.25  0  0.12     0  0.08
2     0  0  0.89  0.65  0.75

So where dataframe2 is True, keep the value from dataframe1, and where it is False, replace the value with 0. How can I do this?

CodePudding user response:

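You can use DataFrame.where or DataFrame.mask. Passing df2.values (a plain NumPy array) applies the condition by position, which matters here because the two dataframes have different row labels: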

df1 = df1.where(df2.values, 0)
# or
df1 = df1.mask(~df2.values, 0)
print(df1)

            A    B     C     D     E
values1  0.25  0.0  0.12  0.00  0.08
values2  0.00  0.0  0.89  0.65  0.75
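
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The dataframes are rebuilt from the question's data; the variable names keep, kept_where and kept_mask are only illustrative and not part of the original answer.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame(
    [[0.25, 0.33, 0.12, 0.22, 0.08],
     [0.20, 0.50, 0.89, 0.65, 0.75]],
    index=["values1", "values2"],
    columns=list("ABCDE"),
)
df2 = pd.DataFrame(
    [[True, False, True, False, True],
     [False, False, True, True, True]],
    index=["boolean1", "boolean2"],
    columns=list("ABCDE"),
)

# A plain boolean array, so the condition is applied by position
# instead of being aligned on the (different) row labels.
keep = df2.to_numpy()

kept_where = df1.where(keep, 0)  # keep cells where the condition is True, else 0
kept_mask = df1.mask(~keep, 0)   # replace cells where the condition is NOT True with 0

print(kept_where)
```

Both calls should give the same frame; where keeps the cells whose condition is True, while mask replaces the cells whose condition is True, hence the ~ in the second call.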

CodePudding user response:

One way is to multiply the underlying arrays (True counts as 1 and False as 0), then build a new dataframe from the result. Note that the row labels are replaced by a default 0, 1 index.

>>> out = pd.DataFrame(df1.values * df2.values, columns=df1.columns)

OUTPUT

      A    B     C     D     E
0  0.25  0.0  0.12  0.00  0.08
1  0.00  0.0  0.89  0.65  0.75

Or you can multiply the dataframes directly after dropping their indices, so that the rows line up by position:

>>> df1.reset_index(drop=True)*df2.reset_index(drop=True)

      A    B     C     D     E
0  0.25  0.0  0.12  0.00  0.08
1  0.00  0.0  0.89  0.65  0.75
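
If the row labels of df1 should survive, a small variation on the multiplication approach is to pass df1's index along when rebuilding the frame. This is only a sketch; the data is copied from the question and the name product is illustrative.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame(
    [[0.25, 0.33, 0.12, 0.22, 0.08],
     [0.20, 0.50, 0.89, 0.65, 0.75]],
    index=["values1", "values2"],
    columns=list("ABCDE"),
)
df2 = pd.DataFrame(
    [[True, False, True, False, True],
     [False, False, True, True, True]],
    index=["boolean1", "boolean2"],
    columns=list("ABCDE"),
)

# Element-wise float * bool: True acts as 1.0 and False as 0.0.
product = df1.to_numpy() * df2.to_numpy()

# Reuse df1's index and columns so the labels are kept.
out = pd.DataFrame(product, index=df1.index, columns=df1.columns)
print(out)
```

One caveat with multiplying: NaN times False is still NaN, so unlike where or mask this does not turn missing values into 0.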

CodePudding user response:

You can use pandas.DataFrame.mask if it's guaranteed that both dataframes will always have the same shape.

Creating the data

import pandas as pd

# numeric values (not strings), with the first column used as the index
df1 = pd.DataFrame([['values1', 0.25, 0.33, 0.12, 0.22, 0.08], ['values2', 0.20, 0.50, 0.89, 0.65, 0.75]], columns = ['index', 'A', 'B', 'C', 'D', 'E']).set_index('index')

df2 = pd.DataFrame([['boolean1', True, False,  True,  False,  True], ['boolean2',  False, False,  True,   True,  True]], columns = ['index', 'A', 'B', 'C', 'D', 'E']).set_index('index')

Mask

# replace every cell where df2 is False with 0
df1.mask(~df2.values, 0)

Output

            A    B     C     D     E
index                               
values1  0.25  0.0  0.12  0.00  0.08
values2  0.00  0.0  0.89  0.65  0.75
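
Since this relies on both dataframes having the same shape, one option is to make that assumption explicit before masking. The helper below is a hypothetical sketch (zero_where_false is not a pandas function, and the data is copied from the question):

```python
import pandas as pd


def zero_where_false(values: pd.DataFrame, flags: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep `values` where `flags` is True, else 0."""
    # Fail early with a readable message if the shapes differ.
    if values.shape != flags.shape:
        raise ValueError(f"shape mismatch: {values.shape} vs {flags.shape}")
    # Positional mask: cells whose flag is False are replaced with 0.
    return values.mask(~flags.to_numpy(dtype=bool), 0)


df1 = pd.DataFrame(
    [[0.25, 0.33, 0.12, 0.22, 0.08],
     [0.20, 0.50, 0.89, 0.65, 0.75]],
    index=["values1", "values2"],
    columns=list("ABCDE"),
)
df2 = pd.DataFrame(
    [[True, False, True, False, True],
     [False, False, True, True, True]],
    index=["boolean1", "boolean2"],
    columns=list("ABCDE"),
)

result = zero_where_false(df1, df2)
print(result)
```

Calling it with frames of different shapes, for example zero_where_false(df1, df2.iloc[:1]), raises the ValueError from the check instead of reaching the mask call.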