Home > Blockchain >  How to insert NaN if value is not between fence_high and fence_low columns
How to insert NaN if value is not between fence_high and fence_low columns

Time:06-27

I have to replace the values from the first three columns with NaN if they are >= than fence_high or <= than fence_low.

I have a dataframe like this:

   col1  col2  col3  fence_high  fence_low
0     1     3     9           9        1.5
1     2     4     6           7          1
2     4     7    -1         6.5          0

This is what I would like to achieve:

   col1  col2  col3  fence_high  fence_low
0   NaN     3   NaN           9        1.5
1     2     4     6           7          1
2     4   NaN   NaN         6.5          0

So far I tried df_new = df[(df < df["fence_high"]) & (df > df["fence_low"])], but this gives me all NaN.

CodePudding user response:

We can simply keep values where they fall between fence_low and fence_high using gt and lt to maintain index alignment:

df.loc[:, 'col1':'col3'] = df.loc[:, 'col1':'col3'].where(
    lambda x: x.gt(df['fence_low'], axis=0) & x.lt(df['fence_high'], axis=0)
)

df

   col1  col2  col3  fence_high  fence_low
0   NaN   3.0   NaN         9.0        1.5
1   2.0   4.0   6.0         7.0        1.0
2   4.0   NaN   NaN         6.5        0.0

If needing a new DataFrame we can join after where to restore the columns that were not considered:

new_df = df.loc[:, 'col1':'col3'].where(
    lambda x: x.gt(df['fence_low'], axis=0) & x.lt(df['fence_high'], axis=0)
).join(df[['fence_high', 'fence_low']])

new_df:

   col1  col2  col3  fence_high  fence_low
0   NaN   3.0   NaN         9.0        1.5
1   2.0   4.0   6.0         7.0        1.0
2   4.0   NaN   NaN         6.5        0.0

CodePudding user response:

One of the ways is to use apply
See if this helps:

import pandas as pd
import numpy as np

cols_list = ["col1", "col2", "col3"]


def compare_val(val, high, low):
    if val >= high or val <= low:
        return np.nan
    return val


def compare(row):
    result = []
    for i in cols_list:
        result.append(
            compare_val(val=row[i], high=row["fence_high"], low=row["fence_low"])
        )
    return pd.Series(result)


data = [[1, 3, 9, 9, 1.5], [2, 4, 6, 7, 1], [4, 7, -1, 6.5, 0]]
df = pd.DataFrame(data, columns=[*cols_list, "fence_high", "fence_low"])
print("Original:\n", df.head())
df[cols_list] = df.apply(compare, axis=1)

print("Transformed:\n", df.head())

Output:

Original:
    col1  col2  col3  fence_high  fence_low
0     1     3     9         9.0        1.5
1     2     4     6         7.0        1.0
2     4     7    -1         6.5        0.0
Transformed:
    col1  col2  col3  fence_high  fence_low
0   NaN   3.0   NaN         9.0        1.5
1   2.0   4.0   6.0         7.0        1.0
2   4.0   NaN   NaN         6.5        0.0
  • Related