Replace values by different conditions in a dataframe

I have a dataframe like this:

import pandas as pd

df_test = pd.DataFrame({'ID1':['A','B','C','BA','BA','AB','>','>','>','>'],
                       'ID2':['','','','','','','mh','mh','nn','nn']})
df_test

I want to obtain the dataframe below, derived from column 'ID1' by two rules: (1) if len(ID1) > 1, then ID1 = ID1[-1] (for example, 'BA' and 'AB' become 'A' and 'B', respectively); (2) if ID1 == '>', then ID1 = ID2 (for example, '>' is replaced with 'mh' or 'nn' from the same row):

df_result = pd.DataFrame({'ID1':['A','B','C','A','A','B','mh','mh','nn','nn']})
df_result

CodePudding user response：

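Use the .str accessor to take the last character, turn '>' into NaN, and fill those gaps from ID2 (assuming numpy is imported as np):

out = df_test['ID1'].str[-1].replace('>', np.nan).fillna(df_test['ID2']).to_frame()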
print(out)

# Output
  ID1
0   A
1   B
2   C
3   A
4   A
5   B
6  mh
7  mh
8  nn
9  nn
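
To make the chain above easier to follow, here is a sketch that splits it into its three steps. The intermediate names last_char and with_gaps are only illustrative and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({'ID1': ['A', 'B', 'C', 'BA', 'BA', 'AB', '>', '>', '>', '>'],
                        'ID2': ['', '', '', '', '', '', 'mh', 'mh', 'nn', 'nn']})

# Step 1: last character of every string ('BA' -> 'A', 'A' -> 'A', '>' -> '>').
last_char = df_test['ID1'].str[-1]

# Step 2: turn the '>' placeholder into a missing value.
with_gaps = last_char.replace('>', np.nan)

# Step 3: fill each missing value from the same row of ID2 (aligned on the index).
out = with_gaps.fillna(df_test['ID2']).to_frame()

print(out)
```

Because fillna aligns on the index, each '>' row picks up the ID2 value from its own row.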

CodePudding user response：

You can use .str[-1] regardless of the length of the strings in the column to select the last character, and use <column>.where(cond, other_col) to fill in values that don't match cond with those values from other_col:

df_test['ID1'] = df_test.assign(ID1=df_test['ID1'].str[-1]).pipe(lambda x: x['ID1'].where(x['ID1'] != '>', x['ID2']))
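
The same idea can be written in two steps, which may be easier to read than the assign/pipe one-liner. This is only a sketch; the names last_char and keep are illustrative.

```python
import pandas as pd

df_test = pd.DataFrame({'ID1': ['A', 'B', 'C', 'BA', 'BA', 'AB', '>', '>', '>', '>'],
                        'ID2': ['', '', '', '', '', '', 'mh', 'mh', 'nn', 'nn']})

# Last character of every value; single-character values are unchanged.
last_char = df_test['ID1'].str[-1]

# where() keeps a value when the condition is True
# and takes the replacement (here ID2) when it is False.
keep = last_char != '>'
df_test['ID1'] = last_char.where(keep, df_test['ID2'])

print(df_test['ID1'].tolist())
```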

CodePudding user response：

You can try using np.where:

import pandas as pd
import numpy as np

df_test = pd.DataFrame({'ID1':['A','B','C','BA','BA','AB','>','>','>','>'],
                       'ID2':['','','','','','','mh','mh','nn','nn']})

# Values such as 'BA' have length 2, so test for length > 1 and keep shorter values as they are.
df_test['ID1'] = np.where(df_test['ID1'].str.len() > 1, df_test['ID1'].str[-1], df_test['ID1'])
df_test['ID1'] = np.where(df_test['ID1'] == '>', df_test['ID2'], df_test['ID1'])

df_test = df_test.drop('ID2', axis=1)
print(df_test)

  ID1
0   A
1   B
2   C
3   A
4   A
5   B
6  mh
7  mh
8  nn
9  nn
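
If the number of rules grows, np.select may be a tidier alternative to chaining np.where calls. This is a different function from the one used in the answer above; the sketch assumes the two rules from the question and keeps every other value unchanged.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({'ID1': ['A', 'B', 'C', 'BA', 'BA', 'AB', '>', '>', '>', '>'],
                        'ID2': ['', '', '', '', '', '', 'mh', 'mh', 'nn', 'nn']})

id1 = df_test['ID1']

# One condition per rule; the first condition that is True for a row wins.
conditions = [id1.str.len() > 1,   # rule 1: longer than one character
              id1 == '>']          # rule 2: placeholder value
choices = [id1.str[-1],            # rule 1: keep only the last character
           df_test['ID2']]         # rule 2: take the value from ID2

# Rows that match neither condition keep their original ID1 value.
df_test['ID1'] = np.select(conditions, choices, default=id1)

print(df_test['ID1'].tolist())
```

The two conditions cannot both be True for the same row here, since '>' has length 1, so their order does not affect the result.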

CodePudding user response：

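mask also works for the '>' rule. On its own it leaves values such as 'BA' untouched, so the last-character step (.str[-1]) still has to be applied separately:

df_test['ID1'] = df_test['ID1'].mask(df_test['ID1'].eq('>'), df_test['ID2'])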
df_test
Out[217]: 
  ID1 ID2
0   A    
1   B    
2   C    
3  BA    
4  BA    
5  AB    
6  mh  mh
7  mh  mh
8  nn  nn
9  nn  nn
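
The output above still contains 'BA' and 'AB'. A sketch that covers both rules with mask could look like this, assuming the last character is taken first; last_char is an illustrative name.

```python
import pandas as pd

df_test = pd.DataFrame({'ID1': ['A', 'B', 'C', 'BA', 'BA', 'AB', '>', '>', '>', '>'],
                        'ID2': ['', '', '', '', '', '', 'mh', 'mh', 'nn', 'nn']})

# Rule 1: reduce every value to its last character.
last_char = df_test['ID1'].str[-1]

# Rule 2: mask() replaces values where the condition is True
# (the opposite of where()), here with the ID2 value from the same row.
df_test['ID1'] = last_char.mask(last_char.eq('>'), df_test['ID2'])

print(df_test['ID1'].tolist())
```

Assigning the result back to the column avoids relying on inplace=True on an attribute-accessed column, which may not update the original dataframe in every pandas version.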