Pandas, data monotonic behavior

Time:02-25

I have a pd.DataFrame similar to the one below:

   C1       C2      
    A  B  C  A  B  C
0 -10 -9 -8 -8 -7 -9
1  -9 -9 -9 -9 -9 -9
2  -8  0 -1 -8  0  0
3   0  1  1  2  3  1

For each row of this dataframe, I need to know whether the values within each condition (C1, C2) are monotonically increasing. Basically, I am looking for a way to generate the following mask:

       0      1
0   True  False
1   True   True
2  False   True
3   True  False

The best I have managed uses apply:

import pandas as pd

dfConcat = []
df = pd.DataFrame({
    ('C1','A'): [-10, -9, -8, 0],
    ('C1','B'): [ -9, -9,  0, 1],
    ('C1','C'): [ -8, -9, -1, 1],
    ('C2','A'): [ -8, -9, -8, 2],
    ('C2','B'): [ -7, -9,  0, 3],
    ('C2','C'): [ -9, -9,  0, 1],
})

idx = pd.IndexSlice
for c in df.columns.unique(0):
    dfConcat.append(df.loc[:,idx[c,:]].apply(lambda x: x.is_monotonic_increasing, axis=1))
                    
mask = pd.concat(dfConcat, axis=1)

print(df)
print('')
print(mask)

I wonder if there is a way to avoid using apply here.

CodePudding user response:

If you test each row for monotonicity with Series.is_monotonic_increasing, you need apply. Your solution can be simplified, though:

mask = (df.groupby(axis=1, level=0)
          .apply(lambda x: x.apply(lambda x: x.is_monotonic_increasing, axis=1)))
print(mask)
      C1     C2
0   True  False
1   True   True
2  False   True
3   True  False
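
Row 1 comes out True for both conditions because is_monotonic_increasing is non-strict: equal neighbours count as increasing. A minimal sketch of that behaviour, using two rows taken from C1 in the question (the `strict` variable is only an illustration, not part of the original answer):

```python
import pandas as pd

tied = pd.Series([-9, -9, -9])     # row 1 of C1: all values equal
dipping = pd.Series([-8, 0, -1])   # row 2 of C1: drops at the last step

# Non-strict check: ties are allowed, a drop is not
print(tied.is_monotonic_increasing)
print(dipping.is_monotonic_increasing)

# Illustrative strict variant: additionally require that no value repeats
strict = tied.is_monotonic_increasing and tied.is_unique
print(strict)
```

If you need strictly increasing rows, you would have to add a uniqueness test like the one above, or compare differences with > 0 instead of >= 0.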

An alternative that tests for monotonic increase in NumPy:

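import numpy as np
# A row is non-decreasing when every difference between neighbouring columns is >= 0
f = lambda x: pd.Series(np.all(np.diff(x, axis=1) >= 0, axis=1), index=x.index)
mask = df.groupby(axis=1, level=0).apply(f)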
print(mask)
      C1     C2
0   True  False
1   True   True
2  False   True
3   True  False
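
Note that groupby(axis=1) is deprecated as of pandas 2.1, so the snippets above may warn or fail on newer versions. Below is a sketch of the same np.diff test without groupby or apply. It assumes every level-0 label selects a purely numeric sub-frame, and the helper name `rowwise_monotonic_increasing` is made up for this example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    ('C1', 'A'): [-10, -9, -8, 0],
    ('C1', 'B'): [ -9, -9,  0, 1],
    ('C1', 'C'): [ -8, -9, -1, 1],
    ('C2', 'A'): [ -8, -9, -8, 2],
    ('C2', 'B'): [ -7, -9,  0, 3],
    ('C2', 'C'): [ -9, -9,  0, 1],
})

def rowwise_monotonic_increasing(frame):
    """Return one boolean column per level-0 label (hypothetical helper).

    A row is True when its values under that label never decrease
    from left to right (ties allowed).
    """
    return pd.DataFrame(
        {
            # frame[cond] selects the sub-frame of one condition;
            # np.diff along axis=1 gives differences between neighbouring columns
            cond: np.all(np.diff(frame[cond].to_numpy(), axis=1) >= 0, axis=1)
            for cond in frame.columns.unique(level=0)
        },
        index=frame.index,
    )

mask = rowwise_monotonic_increasing(df)
print(mask)
```

There is still a Python loop here, but it runs once per condition rather than once per row, so the per-row work stays inside NumPy.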