Apply multi conditional mask to dataframe

Time:11-12

I need to apply a mask to my sparse matrix df and then convert bools to 1.0, like so:

link = 16.0 
mask = (df<=link) 
# where() keeps entries where mask is True and sets all others to 1.0
df = df.where(mask, 1.0)

This works. But now I need to use another condition for masking, like so:

mask = (df<=link) & (df!=0.0)

or:

mask = ((df<=link) & (df!=0.0))

But this throws me the error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

EDIT:

df.dtypes prints:

0       >f4
1       >f4
2       >f4
3       >f4
4       >f4
       ... 
1853    >f4
1854    >f4
1855    >f4
1856    >f4
1857    >f4
Length: 1858, dtype: object

This is how I get my matrix:

import pandas as pd
from astropy.io import fits

with fits.open('matrix_CEREBELLUM_large.fits') as data:
    df = pd.DataFrame(data[0].data)

Link to the matrix:

https://cosmosimfrazza.myfreesites.net/cosmic-web-and-brain-network-datasets


What am I missing?

CodePudding user response:

Looking at the line

brain_mask = (df<=brain_Llink & df<=brain_Llink!=0)

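there are two subtle bugs. First, df <= brain_Llink != 0 is a chained comparison, which Python evaluates with an implicit and, so it asks for the truth value of a whole DataFrame. Second, operator precedence: & binds more tightly than the comparison operators, so a <= b & c != d is parsed as a <= (b & c) != d, but you want (a <= b) & (c != d). So fix with: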

brain_mask = ((df <= brain_Llink) & (df != 0))

#or

brain_mask = df.le(brain_Llink) & df.ne(0)
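To see the fix on something small, here is a minimal sketch using a made-up 3x3 frame (the values and the name link are only for illustration, not taken from the FITS matrix). It builds the mask with explicit parentheses, checks that the method form gives the same result, and shows that a chained comparison raises the ValueError. The final astype step assumes a 0/1 matrix is what you are after.

```python
import pandas as pd

# Toy data standing in for the real matrix (values are made up)
df = pd.DataFrame(
    [[0.0, 5.0, 20.0],
     [16.0, 0.0, 3.0],
     [30.0, 8.0, 0.0]],
    dtype="float32",
)
link = 16.0

# Parenthesise each comparison so & combines two boolean frames
mask = (df <= link) & (df != 0)

# Equivalent method form, no extra parentheses needed
mask_methods = df.le(link) & df.ne(0)
assert mask.equals(mask_methods)

# A chained comparison uses an implicit `and`, which needs a single truth value
try:
    bad = 0 < df <= link
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# Turn the boolean mask into 1.0 / 0.0
links = mask.astype("float32")
print(links)
```

The method form (le, ne) avoids the precedence question entirely, which can be easier to read when more conditions are added.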

If you also get this error:

ValueError: Big-endian buffer not supported on little-endian compiler

which may be what led you to this page, converting the data to little-endian when loading it will fix it:

import pandas as pd
from astropy.io import fits

with fits.open('matrix_CEREBELLUM_large.fits') as data:
    # change from big-endian to little-endian
    df = pd.DataFrame(data[0].data.byteswap().newbyteorder())
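If your NumPy no longer has ndarray.newbyteorder() (it was removed in NumPy 2.0), one alternative is to cast the array to the native byte order with astype. This is only a sketch: it uses a small synthetic big-endian array in place of data[0].data so that it runs without the FITS file, and the values are made up.

```python
import numpy as np
import pandas as pd

# Synthetic big-endian float32 array standing in for data[0].data
big = np.array([[0.0, 5.0], [16.0, 20.0]], dtype=">f4")

# Cast to the machine's native byte order; the values are unchanged
native = big.astype(big.dtype.newbyteorder("="))

df = pd.DataFrame(native)
mask = (df <= 16.0) & (df != 0)
print(df.dtypes)
print(mask)
```

With the real file, the same cast would be applied to data[0].data before passing it to pd.DataFrame.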