Home > database >  How to check if values in column are NOT null and then output a specific value to another column usi
How to check if values in column are NOT null and then output a specific value to another column usi

Time:07-20

I am currently writing a function in pandas to try to check rows in a column to see if they are not null. If they are not null, I want something to be outputed to a new column and for this case it would be 'Financing'. Basically if a row has a value for loan funded date, I want the phrase Financing to be printed to a new column called Payment Type. On top of that, I have a column called Claim Approved Date that I want the same thing to be done but output 'Insurance'. Lastly, if both columns are null I would like Cash/Credit to be outputed to Payment Type. For some reason I cannot get this to work. I want it done as a function but I can't seem to perform my desired outcome using this method.

def typepayment(x):
    m = x['Loan Funded Date'].notna()
    b = x['Claim Approved Date'].notna()
    for a in m:
        if a == True:
            x['Payment Type'] = 'Financing'
        for d in b:
            if d == True:
                x['Payment Type'] = 'Insurance'
    #     If loan funded date is false and claim approved date is false
    #             Payment Type is Cash/Credit
            
        
        
    return x
    
    df2 = df.apply(typepayment, axis = 1)
    df2

This is what I got so far but I have a feeling it is extremely off. Any pointers would be very appreciated! Thank you so much.

CodePudding user response:

Instead of iterating over the rows you can use:

Edit: Just read the last part about 'Cash/Credit' when both are null.

import pandas as pd
import numpy as np
df  = pd.DataFrame({'Loan Funded Date': [1, np.nan, np.nan], 'Claim Approved Date': [np.nan, 1, np.nan]})
df['Payment Type'] = (
    df['Loan Funded Date']
    .notnull()
    .replace({True: 'Financing', False: np.nan})
    .combine_first(
        df['Claim Approved Date']
        .notnull()
        .replace({True: 'Insurance', False: 'Cash/Credit'})
    )
)

Output:

   Loan Funded Date  Claim Approved Date Payment Type
0               1.0                  NaN    Financing
1               NaN                  1.0    Insurance
2               NaN                  NaN  Cash/Credit

CodePudding user response:

IIUC, there are two approaches

1. Using Apply

Note: apply is not an inplace function - it does not modify values in the original object, so you need to assign it back

import numpy as np
import pandas as pd

def typepayment(x):
    is_na = x.isna()        # Mask to see which fields have NaN
    if not is_na['Loan Funded Date']:
        return "Financing"
    elif not is_na['Claim Approved Date']:
        return "Insurance"
    else:
        return 'Cash/Credit'

# Usage
df['Payment_Type'] = df.apply(typepayment, axis = 1)

Test

data = {'Loan Funded Date':["01-02-2022", np.nan, np.nan],
        'Claim Approved Date':['03--17-2022', '04-15-2022', np.nan]}

df = pd.DataFrame(data)
df['Payment_Type'] = df.apply(typepayment, axis = 1)

print(df)

Output

    Loan Funded Date    Claim Approved Date Payment_Type
0   01-02-2022          03--17-2022         Financing
1   NaN                 04-15-2022          Insurance
2   NaN                 NaN                 Cash/Credit

2. Using numpy.where

Note: This method is faster than apply on large datasets

df['Payment_Type'] = np.where(~df['Loan Funded Date'].isna(), 
                              "Financing", 
                              np.where(~df['Claim Approved Date'].isna(), "Insurance", "Cash/Credit"))

Test - Output same as Method 1

  • Related