How to modify a function so that it returns 2 DataFrames depending on values in Python Pandas?

Time:07-07

I have a function in Python Pandas like the one below:

def my_func(df, col: str):
    if pd.isna(df[col]):
          return False

To use my function I call: df_resul = my_func(df = my_df, col = "col1")

And a DataFrame like the one below, where col1 has a string data type:

col1
--------
NaN
ABC
NaN

How can I modify my function so that it returns 2 different DataFrames:

  1. Rows where col1 is NaN
  2. Rows where col1 is a value other than NaN

So I want to call my function as: df_nan, df_not_nan = my_func(df = my_df, col = "col1"), where df_nan contains the rows where col1 is NaN and df_not_nan contains the rows where col1 is a value other than NaN.

df_nan:

col1
------
NaN
NaN

df_not_nan:

col1
-----
ABC

How can I modify my function in Python Pandas?

CodePudding user response:

Use boolean indexing, with ~ to invert the mask and select the rows with non-missing values:

print (my_df)
  col1  a
0  NaN  1
1  ABC  2
2  NaN  3

def my_func(df, col: str):
    m = df[col].isna()
    return df[m], df[~m]
 
df_nan, df_not_nan = my_func(df = my_df, col = "col1")
print (df_nan)
  col1  a
0  NaN  1
2  NaN  3

print (df_not_nan)
  col1  a
1  ABC  2
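
For reference, the split above can be reproduced end to end with the sketch below. It rebuilds the sample frame from the printed output, so the constructor call is an assumption about how my_df was created, and the .copy() calls are an optional addition not present in the answer (they only make the two results independent of the original frame).

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the printed output above (assumed construction).
my_df = pd.DataFrame({"col1": [np.nan, "ABC", np.nan], "a": [1, 2, 3]})


def my_func(df: pd.DataFrame, col: str):
    # Boolean mask: True where the value in `col` is missing.
    m = df[col].isna()
    # First frame: rows with NaN; second frame: rows without NaN (~ inverts the mask).
    # .copy() is optional; it keeps later edits from touching the original frame.
    return df[m].copy(), df[~m].copy()


df_nan, df_not_nan = my_func(df=my_df, col="col1")
print(df_nan.index.tolist())      # [0, 2]
print(df_not_nan.index.tolist())  # [1]
```

Every row lands in exactly one of the two frames, so their lengths always add up to len(df).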

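If you need to test whether at least one missing value exists, note that an if statement needs a single boolean while pd.isna(df[col]) returns a boolean Series, so it is necessary to add Series.any to avoid this error: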

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

def my_func1(df, col: str):
    if pd.isna(df[col]).any():
        return 'exist at least one missing values'
    else:
        return 'no missing values'

 
out = my_func1(df = my_df, col = "col1")
print (out)
exist at least one missing values
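
To see why the original function fails, here is a minimal sketch that triggers the error on purpose and then reduces the mask to a single boolean. The sample frame and the variable names (raised, has_missing, all_missing) are illustrative only and not part of the question.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same col1 values as in the question.
my_df = pd.DataFrame({"col1": [np.nan, "ABC", np.nan]})

raised = False
try:
    # pd.isna on a column returns a boolean Series, which `if` cannot evaluate.
    if pd.isna(my_df["col1"]):
        pass
except ValueError:
    raised = True

# Reduce the mask to one boolean before branching on it.
has_missing = bool(my_df["col1"].isna().any())  # at least one NaN
all_missing = bool(my_df["col1"].isna().all())  # every value is NaN

print(raised, has_missing, all_missing)  # True True False
```

Series.any answers "is there at least one missing value?", while Series.all answers "is every value missing?".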