Replace the column value with certain conditions

My dataframe is named "Assert_data" and looks like this:

System_Name  Type
ABC          Test-3451
BCS          Dev-2323
ASD          Dev-1213
FGS          Prod-1212
ASC          Tes-1244

I need help replacing the column values: if the Type column contains "Prod", it should be replaced with "Production".

Likewise, "Dev" should become "Development" and "Tes" should become "Test".

I tried the following code, but it throws "ValueError: either both or neither of x and y should be given":

df['Instance_type'] = np.where(Assert_data['Type'].str.contains("Prod"), "Production",
                                np.where(Assert_data['Type'].str.contains("Dev"), "Development",
                                np.where(Assert_data['Type'].str.contains("Tes"), "Test")))

I also tried pd.np.where and got the same error message.

CodePudding user response：

The last np.where is missing its y argument, the value to use when none of the conditions match:

Assert_data['Instance_type'] = np.where(Assert_data['Type'].str.contains("Prod"), "Production",
                               np.where(Assert_data['Type'].str.contains("Dev"), "Development",
                               np.where(Assert_data['Type'].str.contains("Tes"), "Test", None)))

print (Assert_data)
  System_Name       Type Instance_type
0         ABC  Test-3451          Test
1         BCS   Dev-2323   Development
2         ASD   Dev-1213   Development
3         FGS  Prod-1212    Production
4         ASC   Tes-1244          Test
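
To see the error in isolation, here is a minimal sketch that does not depend on the dataframe. The mask and the "Other" fallback label are made up for illustration; only the np.where call shape mirrors the question.

```python
import numpy as np

# Hypothetical boolean mask standing in for a str.contains() result.
mask = np.array([True, False, True])

# Passing the condition and x but no y reproduces the error from the question.
try:
    np.where(mask, "Production")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Supplying y (the value used where the mask is False) makes the call valid.
labels = np.where(mask, "Production", "Other")
print(error_message)
print(labels)
```

In the nested version, only the innermost np.where lacked its y, which is why adding None there is enough.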

Or use numpy.select:

Assert_data['Instance_type'] = np.select([Assert_data['Type'].str.contains("Prod"),
                                          Assert_data['Type'].str.contains("Dev"),
                                          Assert_data['Type'].str.contains("Tes")],
                                         [ "Production","Development","Test"],
                                         default=None)
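
A self-contained sketch of the np.select variant follows. The sixth row ("XYZ", "QA-9999") and the "Unknown" fallback are assumptions added here to show what happens to a value that matches no condition; they are not part of the original data. Conditions are checked in order and the first one that matches wins.

```python
import numpy as np
import pandas as pd

# Data from the question plus one hypothetical row that matches no prefix.
Assert_data = pd.DataFrame({
    "System_Name": ["ABC", "BCS", "ASD", "FGS", "ASC", "XYZ"],
    "Type": ["Test-3451", "Dev-2323", "Dev-1213", "Prod-1212", "Tes-1244", "QA-9999"],
})

# One boolean condition per label, in the same order as the choices list.
conditions = [
    Assert_data["Type"].str.contains("Prod"),
    Assert_data["Type"].str.contains("Dev"),
    Assert_data["Type"].str.contains("Tes"),
]
choices = ["Production", "Development", "Test"]

# "Unknown" is an assumed fallback; the answer above uses None instead.
Assert_data["Instance_type"] = np.select(conditions, choices, default="Unknown")
print(Assert_data)
```

Compared with nested np.where calls, adding a fourth category here only means appending one condition and one choice.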

Or use Series.str.extract with Series.map:

d = {'Prod': "Production","Dev":"Development","Tes":"Test"}

Assert_data['Instance_type'] = Assert_data['Type'].str.extract(f'({"|".join(d)})', expand=False).map(d)
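
The extract-and-map approach can be sketched step by step as below. The "QA-9999" row, the leading ^ anchor and the fillna("Unknown") fallback are assumptions added for illustration: without fillna, a value that matches none of the keys ends up as NaN.

```python
import pandas as pd

# Types from the question plus one hypothetical value with no known prefix.
Assert_data = pd.DataFrame(
    {"Type": ["Test-3451", "Dev-2323", "Prod-1212", "Tes-1244", "QA-9999"]}
)

d = {"Prod": "Production", "Dev": "Development", "Tes": "Test"}

# Build "^(Prod|Dev|Tes)" from the dictionary keys; anchoring at the start
# is an assumption that the prefix always comes first.
pattern = f"^({'|'.join(d)})"

# Step 1: pull out the matching key (NaN where nothing matches).
prefix = Assert_data["Type"].str.extract(pattern, expand=False)

# Step 2: translate the key to its full label, with an assumed fallback.
Assert_data["Instance_type"] = prefix.map(d).fillna("Unknown")
print(Assert_data)
```

If the keys could ever contain regex metacharacters, escaping them with re.escape before joining would be the safer choice.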

CodePudding user response：

You could use a dictionary to craft a regex for str.replace:

repl = {'Tes': 'Test', 'Prod': 'Production', 'Dev': 'Development'}

df['Type2'] = df['Type'].str.replace(fr"({'|'.join(repl)})\b",
                                     lambda x: repl.get(x.group(0)),
                                     regex=True)

Or, to replace the whole value rather than just the prefix:

df['Type3'] = df['Type'].str.replace(fr"^({'|'.join(repl)}).*$",
                                     lambda x: repl.get(x.group(1)),
                                     regex=True)

You could even map from only the first letter (or the first few letters) if that is enough to tell the values apart:

repl = {x[0]: x for x in ['Test', 'Production', 'Development']}

df['Type3'] = df['Type'].str.replace(fr"^({'|'.join(repl)}).*$",
                                     lambda x: repl.get(x.group(1)),
                                     regex=True)

output:

  System_Name       Type             Type2        Type3
0         ABC  Test-3451         Test-3451         Test
1         BCS   Dev-2323  Development-2323  Development
2         ASD   Dev-1213  Development-1213  Development
3         FGS  Prod-1212   Production-1212   Production
4         ASC   Tes-1244         Test-1244         Test
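
The first row of Type2 stays "Test-3451" because of the \b word boundary in the pattern: in "Test", the prefix "Tes" is followed by another word character, so it does not match. A small sketch with the standard re module (the helper name expand is made up for this example) shows the difference:

```python
import re

repl = {"Tes": "Test", "Prod": "Production", "Dev": "Development"}

# Same pattern as the Type2 example: a known prefix followed by a word boundary.
pattern = re.compile(rf"({'|'.join(repl)})\b")

def expand(value):
    # Hypothetical helper: swap each matched prefix for its full label.
    return pattern.sub(lambda m: repl[m.group(0)], value)

with_boundary = [expand(v) for v in ["Test-3451", "Tes-1244", "Dev-2323"]]

# Without \b, "Tes" inside "Test" is also replaced, doubling the final "t".
without_boundary = re.sub("|".join(repl), lambda m: repl[m.group(0)], "Test-3451")

print(with_boundary)
print(without_boundary)
```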