python - if-else in a for loop processing one column

I want to loop through a column and convert it into a processed Series.
Below is an example data frame with two rows and four columns:

import pandas as pd
from rapidfuzz import process as process_rapid
from rapidfuzz import utils as rapid_utils

data = [['r/o ac. nephritis.  /.  nephrotic syndrome', ' ac. nephritis.  /.  nephrotic syndrome',1,'ac   nephritis      nephrotic syndrome'], [ 'sternocleidomastoid contracture','sternocleidomastoid contracture',0,"NA"]]

# Create the pandas DataFrame

df_diagnosis = pd.DataFrame(data, columns = ['diagnosis_name', 'diagnosis_name_edited','is_spell_corrected','spell_corrected_value'])

I want to use the spell_corrected_value column if the is_spell_corrected column is more than 1. Otherwise, I want to use diagnosis_name_edited.

At the moment, I have the following code, which uses the diagnosis_name_edited column directly. How do I turn it into an if-else/lambda check on the is_spell_corrected column?

# Lazily normalise each diagnosis string (generator expression)
unmapped_diag_series = (rapid_utils.default_process(d) for d in df_diagnosis['diagnosis_name_edited'].astype(str))
# Materialise the generator into a Series
unmapped_processed_diagnosis = pd.Series(unmapped_diag_series)

Thank you.

CodePudding user response:

If I understand you correctly, try this fast solution using numpy.where:

import numpy as np
df_diagnosis['new_column'] = np.where(df_diagnosis['is_spell_corrected'] > 1, df_diagnosis['spell_corrected_value'], df_diagnosis['diagnosis_name_edited'])
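
To tie the np.where selection back to the processing step in the question, here is a minimal sketch that picks the source column first and then applies a string processor to the result. The helper name `select_and_process` and its `processor` parameter are hypothetical, and `str.strip` is only a stand-in so the snippet runs without rapidfuzz. The sample data holds only 0 and 1 in is_spell_corrected, so the sketch assumes "corrected" means `== 1`; change the comparison if the flag really can exceed 1.

```python
import numpy as np
import pandas as pd


def select_and_process(df, processor=str.strip):
    """Hypothetical helper: choose a column per row, then process each string."""
    chosen = np.where(
        df["is_spell_corrected"] == 1,   # assumption: the flag is 0 or 1
        df["spell_corrected_value"],     # used when the flag is set
        df["diagnosis_name_edited"],     # fallback otherwise
    )
    # Wrap the array in a Series, cast to str as in the question, then process
    return pd.Series(chosen, index=df.index).astype(str).map(processor)


df_diagnosis = pd.DataFrame(
    [
        ["r/o ac. nephritis.  /.  nephrotic syndrome",
         " ac. nephritis.  /.  nephrotic syndrome",
         1,
         "ac   nephritis      nephrotic syndrome"],
        ["sternocleidomastoid contracture",
         "sternocleidomastoid contracture",
         0,
         "NA"],
    ],
    columns=["diagnosis_name", "diagnosis_name_edited",
             "is_spell_corrected", "spell_corrected_value"],
)

unmapped_processed_diagnosis = select_and_process(df_diagnosis)
```

Passing `processor=rapid_utils.default_process` (with the imports from the question) should apply the same rapidfuzz normalisation as the original generator, just to the selected column instead of always diagnosis_name_edited.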