Creating and assigning values of a new Pandas DataFrame column

Time:06-04

Problem statement: Create a new categorical variable with the values Open and Closed. Open and Pending are to be categorized as Open; Closed and Solved are to be categorized as Closed.

Description: I have a dataframe with a column 'Status' that contains the following values:

0       Closed
1       Closed
2       Closed
3         Open
4       Solved
         ...  
2219    Closed
2220    Solved
2221    Solved
2222    Solved
2223      Open

Now I am supposed to create another column based on the 'Status' column above. If the value of Status is 'Open' or 'Pending', the new column 'Final Status' should have the value 'Open'. Similarly, if the value of Status is 'Closed' or 'Solved', 'Final Status' should have the value 'Closed'. I tried the following code, but it doesn't work and gives me incorrect results. Telecom is the dataframe.

for i in Telecom['Status']:
    if i =='Open' or i =='Pending':
        Telecom['Final Status']= Telecom['Status'].replace(['Open','Pending'],'Open')
    elif i =='Closed' or i =='Solved':
        Telecom['Final Status']= Telecom['Status'].replace(['Closed','Solved'],'Closed')

The result for the column 'Final Status' is as follows:

   Status Final Status
0  Closed       Closed
1  Closed       Closed
2  Closed       Closed
3    Open         Open
4  Solved       Solved

I can't figure out where I am going wrong. It seems it's just copying the values from 'Status' into 'Final Status'.

CodePudding user response:

IIUC, you can use np.select() to get what you are looking for:

import numpy as np
# Each condition tests the 'Status' column; the first matching condition wins
condition_list = [
    (df['Status'] == 'Open') | (df['Status'] == 'Pending'),
    (df['Status'] == 'Closed') | (df['Status'] == 'Solved'),
]
choice_list = ['Open', 'Closed']
df['Final Status'] = np.select(condition_list, choice_list, '')
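
A hedged, self-contained sketch of the same idea follows. The loop in the question reassigns the whole 'Final Status' column on every iteration, so only the branch taken for the last row (an 'Open') survives, which is why 'Solved' is left unchanged. The version below builds both conditions once with isin() and applies them in a single np.select() call. The sample data and the 'Unknown' default label are assumptions made for illustration, not part of the original dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the Telecom dataframe
df = pd.DataFrame({'Status': ['Closed', 'Open', 'Solved', 'Pending', 'Closed']})

# One boolean mask per target category, evaluated over the whole column
conditions = [
    df['Status'].isin(['Open', 'Pending']),
    df['Status'].isin(['Closed', 'Solved']),
]
choices = ['Open', 'Closed']

# Rows matching neither mask get the (assumed) 'Unknown' label
df['Final Status'] = np.select(conditions, choices, default='Unknown')
print(df)
```

Using a visible default such as 'Unknown' rather than an empty string may make unexpected Status values easier to spot, though either works.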

CodePudding user response:

You could just use a simple map:

d = {'Open': 'Open', 'Pending': 'Open', 'Closed': 'Closed', 'Solved': 'Closed'}

df['Final Status'] = df['Status'].map(d)

or, mapping only the values that change and leaving the rest as they are:

d = {'Pending': 'Open', 'Solved': 'Closed'}

df['Final Status'] = df['Status'].map(lambda x: d.get(x, x))
# or
# df['Final Status'] = df['Status'].map(d).fillna(df['Status'])

output:

      Status Final Status
0     Closed       Closed
1     Closed       Closed
2     Closed       Closed
3       Open         Open
4     Solved       Closed
2219  Closed       Closed
2220  Solved       Closed
2221  Solved       Closed
2222  Solved       Closed
2223    Open         Open
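
One difference between the two map() variants may be worth seeing side by side: with the full dictionary, a Status value that is not a key becomes NaN, while the fillna() variant keeps the original value. The sketch below illustrates this on made-up data; the 'On Hold' value is hypothetical and only there to show the behavior. Since the problem statement asks for a categorical variable, the last line converts the result with astype('category'), which is one possible way to do that.

```python
import pandas as pd

# Hypothetical sample data; 'On Hold' is an invented value not present in the question
df = pd.DataFrame({'Status': ['Closed', 'Open', 'Solved', 'Pending', 'On Hold']})

d = {'Open': 'Open', 'Pending': 'Open', 'Closed': 'Closed', 'Solved': 'Closed'}

# Strict: values missing from the dictionary become NaN
strict = df['Status'].map(d)

# Lenient: values missing from the dictionary fall back to the original Status
lenient = df['Status'].map(d).fillna(df['Status'])

# Store the result as a pandas categorical column
df['Final Status'] = lenient.astype('category')
print(df)
```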