Nested for loop with if in one line for creating new column in dataframe

Time:10-21

I have a dataframe "result" and want to create a new column called "type". The value in "type" should be the dict value whose key is contained in the dataframe's "Particulars" column.

dict_classify = {'key1': 'content1',
                 'key2': 'content2'}

result['type']=[dict_classify[key] if key.lower() in i.lower() else np.nan 
                for key in dict_classify.keys() 
                for i in result['Particulars']]

It returns the error "Length of values (5200) does not match length of index (1040)". Any idea what I did wrong?

CodePudding user response:

It sounds like each entry i of result["Particulars"] is either a key of dict_classify, in which case the corresponding entry of result["type"] should be dict_classify[i], or not a key, in which case the corresponding entry should be NaN. If that's the case, then you should have something like

result['type'] = [dict_classify.get(i,np.nan) for i in result['Particulars']]
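As for the error itself: the comprehension in the question has two for clauses, so it produces one value per (key, row) pair rather than one per row. The reported 5200 against 1040 would be consistent with a dict of five keys, though that is a guess. A small sketch with made-up data (the frame and dict here are hypothetical stand-ins, not the asker's data) shows the difference in lengths:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's data.
dict_classify = {'key1': 'content1', 'key2': 'content2'}
result = pd.DataFrame({'Particulars': ['key1', 'other', 'key2']})

# Two for clauses: one value per (key, row) pair.
nested = [dict_classify[key] if key.lower() in i.lower() else np.nan
          for key in dict_classify.keys()
          for i in result['Particulars']]

# One for clause: one value per row, so it fits the index.
flat = [dict_classify.get(i, np.nan) for i in result['Particulars']]

print(len(nested), len(flat), len(result))
```

Only the one-clause list has the same length as the dataframe, which is why it can be assigned as a column.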

The same result can also be written with pandas' apply:

result['type'] = result['Particulars'].apply(lambda i: dict_classify.get(i,np.nan))
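Note that dict.get only matches when the whole cell equals a key. If, as the question's wording suggests, a key only needs to appear somewhere inside "Particulars" (ignoring case), a helper along these lines may be closer to what is wanted. This is a sketch under assumptions: classify is a hypothetical name, the sample rows are invented, and the first matching key wins when several match.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's data.
dict_classify = {'key1': 'content1', 'key2': 'content2'}
result = pd.DataFrame(
    {'Particulars': ['Paid KEY1 invoice', 'misc', 'key2 refund']}
)

def classify(text):
    """Return the value of the first key found inside text, else NaN."""
    lowered = str(text).lower()
    for key, value in dict_classify.items():
        if key.lower() in lowered:
            return value
    return np.nan

# One call per row, so the new column matches the index length.
result['type'] = result['Particulars'].apply(classify)
```

Looping over the keys inside the helper keeps the output at one value per row, which avoids the length mismatch.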

