Pandas DataFrame subselection

Time:10-31

The code below shows what I would like to do. Is it possible to do it without the iteritems iteration?

Basically I have a DataFrame df with MultiIndex columns, and a smaller boolean DataFrame sdf whose columns are only the first level of that MultiIndex. I would like to use sdf as a preselection/mask to set values in df.

import pandas as pd

mi = pd.MultiIndex.from_product([['A', 'B'], [1, 2]])
df = pd.DataFrame([[0, 1, 2, 3], [1, 2, 3, 4]], columns=mi, index=[0, 1])
print(df)
sdf = pd.DataFrame([[True, False], [False, True]], columns=['A', 'B'])
print(sdf)
# items() is the current name for iteritems(), which was removed in pandas 2.0
for name, value in sdf.items():
    df.loc[value, name] = 10
print(df)

output:

   A     B   
   1  2  1  2
0  0  1  2  3
1  1  2  3  4

       A      B
0   True  False
1  False   True

    A       B    
    1   2   1   2
0  10  10   2   3
1   1   2  10  10

Thank you!

CodePudding user response:

Use DataFrame.mask, with DataFrame.reindex to broadcast sdf across the first level of df's columns:

df = df.mask(sdf.reindex(df.columns, axis=1, level=0), 10)
print(df)
    A       B    
    1   2   1   2
0  10  10   2   3
1   1   2  10  10
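
To check that this really matches the loop from the question, here is a minimal self-contained sketch. The names looped and masked are only illustrative, and items() stands in for iteritems() so that it also runs on newer pandas versions:

```python
import pandas as pd

mi = pd.MultiIndex.from_product([['A', 'B'], [1, 2]])
df = pd.DataFrame([[0, 1, 2, 3], [1, 2, 3, 4]], columns=mi, index=[0, 1])
sdf = pd.DataFrame([[True, False], [False, True]], columns=['A', 'B'])

# Loop version from the question, applied to a copy so df stays untouched
looped = df.copy()
for name, value in sdf.items():
    looped.loc[value, name] = 10

# Vectorized version: expand sdf to df's columns, then replace where True
masked = df.mask(sdf.reindex(df.columns, axis=1, level=0), 10)

# Compare the two results cell by cell
print((masked.values == looped.values).all())
```

Note that mask returns a new DataFrame, which is why the result is assigned back to df in the answer.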

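Details: with level=0, reindex repeats each column of sdf for every second-level label under it, so the resulting mask has the same columns as df: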

print(sdf.reindex(df.columns, axis=1, level=0))
       A             B       
       1      2      1      2
0   True   True  False  False
1  False  False   True   True
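
If you prefer to build the expanded mask by hand, one possible alternative (not part of the answer above, just a sketch) is to select the columns of sdf once per first-level label and then relabel them. The names top, expanded and result are illustrative, and this assumes sdf and df share the same row index:

```python
import pandas as pd

mi = pd.MultiIndex.from_product([['A', 'B'], [1, 2]])
df = pd.DataFrame([[0, 1, 2, 3], [1, 2, 3, 4]], columns=mi, index=[0, 1])
sdf = pd.DataFrame([[True, False], [False, True]], columns=['A', 'B'])

# First-level label of every column in df: A, A, B, B
top = df.columns.get_level_values(0)

# Pick sdf's columns once per label, then relabel them with df's MultiIndex
expanded = sdf[top].set_axis(df.columns, axis=1)

# Replace the cells of df where the expanded mask is True
result = df.mask(expanded, 10)
print(result)
```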