Home > database >  create a new dataframe out of a existing dataframe, based on conditional value of another dataframe
create a new dataframe out of a existing dataframe, based on conditional value of another dataframe

Time:09-21

i try to generate a new dataframe (expected_result) , which is based on two existing dataframes (rank, value) and a filter. both existing datafames have the same header and index. the new dataframe should show the value of the dataframe "value" if the value in the dataframe "rank" is equal or less than the filter_var. if the ranking is higher than the filter_var = show "nan"

import pandas as pd

#df 1 Rank
ranking = {'Stock A':[1, 1, 1, 1],
           'Stock B':[3, 3, 4, 4],
           'Stock C':[4, 4, 3, 2],
           'Stock D':[2, 2, 2, 3],
           }

rank = pd.DataFrame(ranking)

#df 2 values
values = {'Stock A':[101, 102, 103, 104],
           'Stock B':[99, 99, 99 , 99],
           'Stock C':[99, 98, 100, 103],
           'Stock D':[100, 100, 100, 102],
           }

value = pd.DataFrame(values)

#Filter
filter_var = 2 #better or equal

#expecet result
results = {'Stock A':[101, 102, 103, 104],
           'Stock B':['nan', 'nan', 'nan' , 'nan'],
           'Stock C':['nan', 'nan', 'nan', 103],
           'Stock D':[100, 100, 100, 'nan'],
           }
#example results
expected_result = pd.DataFrame(results)

thanks for help

CodePudding user response:

Since you have the same index, you could achieve that with a simple:

>>> value[rank.le(2)]

   Stock A  Stock B  Stock C  Stock D
0      101      NaN      NaN    100.0
1      102      NaN      NaN    100.0
2      103      NaN      NaN    100.0
3      104      NaN    103.0      NaN

# Create a new DF
res = value[rank.le(2)]

pandas.le compares DataFrames for less than inequality of equality elementwise


To give a bit of description, the code in the brackets returns a DataFrame of booleans, showing True when a value is less than or equal to 2:

>>> rank.le(2)

   Stock A  Stock B  Stock C  Stock D
0     True    False    False     True
1     True    False    False     True
2     True    False    False     True
3     True    False     True    False

This, along with the fact that your DF's share the same index, allows you to select the rows that have the values you want. Hence why:

value[rank.le(2)]

works.

  • Related