Dataframe conditional replacement with integers

Time:05-27

I have a dataframe column like this:

df['col_name'].unique()
>>>array([-1, 'Not Passed, On the boundary', 1, 'Passed, On the boundary',
       'Passed, Unclear result', 'Passes, Unclear result, On the boudnary',
       'Rejected, Unclear result'], dtype=object)

If an element of this column contains the word 'Passed', either as the whole value or as a substring, I want to replace the entire value with the integer 1; otherwise, replace it with the integer -1.

Kindly help me with this

CodePudding user response:

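You can use .str.contains to check whether each value contains the string, passing na=False so that the integer entries (which would otherwise give NaN) count as False. Then use np.where to map True to 1 and False to -1. If you want to keep the original 1 and -1 values, you can use np.select instead.

import numpy as np

# True where the value is a string containing 'Passed'; integers give False
m1 = df['col_name'].str.contains('Passed', na=False)
# True where the value is already the integer 1 or -1
m2 = df['col_name'].isin([1, -1])

# Variant 1: everything without 'Passed' (including the integers) becomes -1
df['col_name_replace_1_-1'] = np.where(m1, 1, -1)
# Variant 2: existing 1 / -1 are kept; the first matching condition wins
df['col_name_keep_1_-1'] = np.select([m2, m1, ~m1], [df['col_name'], 1, -1], default=df['col_name'])
print(df)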

                                  col_name  col_name_replace_1_-1 col_name_keep_1_-1
0                                       -1                     -1                 -1
1              Not Passed, On the boundary                      1                  1
2                                        1                     -1                  1
3                  Passed, On the boundary                      1                  1
4                   Passed, Unclear result                      1                  1
5  Passes, Unclear result, On the boudnary                     -1                 -1
6                 Rejected, Unclear result                     -1                 -1
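
For a version that runs on its own, here is a minimal sketch. It assumes the column looks like the unique values listed in the question; the sample frame and the names has_passed and replaced are illustrative and not part of the original answer. It passes na=False to .str.contains so that the integer entries come back as False.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the unique values shown in the question (an assumption).
df = pd.DataFrame({"col_name": [
    -1,
    "Not Passed, On the boundary",
    1,
    "Passed, On the boundary",
    "Passed, Unclear result",
    "Passes, Unclear result, On the boudnary",
    "Rejected, Unclear result",
]})

# regex=False does a plain substring test; na=False turns the non-string
# entries (the integers) into False instead of NaN.
has_passed = df["col_name"].str.contains("Passed", regex=False, na=False)

# 1 where the substring was found, -1 everywhere else (including the integers).
replaced = pd.Series(np.where(has_passed, 1, -1), index=df.index)
print(replaced.tolist())
```

Note that 'Not Passed, On the boundary' also contains the substring 'Passed', so it maps to 1, while 'Passes, ...' does not and maps to -1.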

CodePudding user response:

df['col_name'] = df['col_name'].apply(lambda x: 1 if 'Passed' in x else -1)

This goes through each entry in df['col_name'], checks whether the string contains 'Passed', and replaces it with 1 or -1 accordingly. It assumes that every entry in the column is a str; the column in the question also holds the integers 1 and -1, for which 'Passed' in x raises a TypeError.
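
One possible way to make the apply approach work on the mixed column is to test the type first. The sketch below assumes that integer entries should be left unchanged, which the question does not state explicitly; the helper name to_flag and the sample data are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the unique values shown in the question (an assumption).
df = pd.DataFrame({"col_name": [
    -1,
    "Not Passed, On the boundary",
    1,
    "Passed, On the boundary",
    "Passed, Unclear result",
    "Passes, Unclear result, On the boudnary",
    "Rejected, Unclear result",
]})

def to_flag(value):
    """Map strings to 1 or -1; pass non-string values through unchanged."""
    if isinstance(value, str):
        return 1 if "Passed" in value else -1
    return value  # assumed: existing integers 1 / -1 are kept as they are

df["col_name"] = df["col_name"].apply(to_flag)
print(df["col_name"].tolist())
```

If the integers should instead be overwritten with -1, changing the last line of to_flag to return -1 would do that.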
