Handling nan rows

Time:07-27

I need to modify a column but exclude the rows where it is NaN, and then put the NaN rows back once the modification is done. So far I split the data into a non-NaN DataFrame and a NaN DataFrame, apply the modification, and use concat to bring the two back together. Is there a better way to do this? concat appends the NaN rows at the bottom. That happens to match my example, but it will not always be the case, and I would like the NaN rows to return to their original positions rather than end up at the bottom.

import pandas as pd
import numpy as np


def modify_data():
    d = {'num': [1, 2, 3, 4, np.nan], 'n_obs': [3, 4, 2, 3, 1], 'target': [3, 4, 5, 2, 7]}
    df = pd.DataFrame(data=d)
    nan_df = df[df["num"].isnull()]
    not_nan_df = df[df["num"].notnull()]
    df["num"] = pd.concat([not_nan_df["num"].clip(lower=2), nan_df["num"]])
    print(df["num"])
    return df["num"].values

CodePudding user response:

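You don't need all of that. Just restrict both sides of the assignment to the non-NaN rows of the num column, so the other columns are not clipped as well:

df.loc[df["num"].notnull(), "num"] = df.loc[df["num"].notnull(), "num"].clip(lower=2)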

Output:

   num  n_obs  target
0  2.0      3       3
1  2.0      4       4
2  3.0      2       5
3  4.0      3       2
4  NaN      1       7
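
The same masking idea can be wrapped up for any modification, not only clip. This is a minimal sketch, assuming only one column needs to change; the helper name `modify_non_nan` and the sample data with the NaN in the middle are hypothetical and chosen for illustration.

```python
import numpy as np
import pandas as pd


def modify_non_nan(df, column, func):
    """Apply func to the non-NaN values of one column.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()  # leave the caller's DataFrame untouched
    mask = out[column].notna()
    # Writing through .loc with a boolean mask targets the same rows that
    # were read, so NaN rows stay in place and other columns are unchanged.
    out.loc[mask, column] = func(out.loc[mask, column])
    return out


df = pd.DataFrame({
    "num": [1, 2, np.nan, 4, 3],
    "n_obs": [3, 4, 1, 3, 2],
    "target": [3, 4, 7, 2, 5],
})
result = modify_non_nan(df, "num", lambda s: s.clip(lower=2))
print(result)
```

Because the mask selects rows by label, the NaN row keeps its original position even when it is not the last row.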

CodePudding user response:

According to the documentation, clip leaves NaN values as they are, so you can apply it to the whole column without separating out the NaN rows:

# Assign the result back. df['num'].clip(lower=2, inplace=True) may not update df when copy-on-write is enabled.
df['num'] = df['num'].clip(lower=2)
print(df)

# Output
   num  n_obs  target
0  2.0      3       3
1  2.0      4       4
2  3.0      2       5
3  4.0      3       2
4  NaN      1       7
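
For a modification that, unlike clip, cannot run on NaN, the split-and-concat approach from the question can still put rows back in their original positions. The following is a rough sketch, assuming the DataFrame index is unique; clip is used here only as a stand-in for the real modification, and the sample data is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "num": [1, np.nan, 3, 4, 2],
    "n_obs": [3, 1, 2, 3, 4],
})

# Split into rows with and without NaN in "num".
nan_rows = df[df["num"].isna()]
valid_rows = df[df["num"].notna()].copy()

# Stand-in for any modification that should only see non-NaN rows.
valid_rows["num"] = valid_rows["num"].clip(lower=2)

# concat appends nan_rows at the bottom; reindexing with the original
# index restores the original row order (assumes a unique index).
restored = pd.concat([valid_rows, nan_rows]).reindex(df.index)
print(restored)
```

Calling `.sort_index()` instead of `.reindex(df.index)` gives the same result only when the original index was already sorted.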