Like Left formula we Used in Excel same as I want to extract no. of character from Policy no. column based on Insurer column like.....
if insurer is HDFC then Extract only 10 character form the sting and if insurer is tata then Extract only 7 character form the sting.
How I will achieve this in python
Insurer | Policy no. | Expected OutPut |
---|---|---|
Hdfc | 4509242332 | 4509242332 |
Tata | tatadigitNational | tatadig |
Hdfc | 09082323ab12sd | 09082323ab |
Hdfc | nolanheroman | nolanherom |
Tata | 97543007356 | 9754300 |
Tata | pqrsequence2o202 | pqrsequ |
Tata | 987654321 | 9876543 |
CodePudding user response:
What you can do is define a new function that does the comparison and apply that to a new column (Expected OutPut
in your example).
def f(row):
val = str(row['Policy no.'])
return val[:10] if row['Insurer'] == "Hdfc" else val[:7]
df['Expected OutPut'] = df.apply(f, axis=1)
CodePudding user response:
You can try np.select
df['out'] = np.select(
[df['Insurer'].str.lower().eq('hdfc'),
df['Insurer'].str.lower().eq('tata')],
[df['Policy no.'].str[:10],
df['Policy no.'].str[:7],],
df['Policy no.']
)
print(df)
Insurer Policy no. Expected OutPut out
0 Hdfc 4509242332 4509242332 4509242332
1 Tata tatadigitNational tatadig tatadig
2 Hdfc 09082323ab12sd 09082323ab 09082323ab
3 Hdfc nolanheroman nolanherom nolanherom
4 Tata 97543007356 9754300 9754300
5 Tata pqrsequence2o202 pqrsequ pqrsequ
6 Tata 987654321 9876543 9876543
CodePudding user response:
One possible solution is,
df['temp'] = df['Insurer'].map({'Hdfc':10, 'Tata':7})
df['Expected Output'] = df.apply(lambda x: x['Policy no.'][:x['temp']], axis=1)
O/P:
Insurer Policy no. Expected Output temp
0 Hdfc 4509242332 4509242332 10
1 Tata tatadigitNational tatadig 7
2 Hdfc 09082323ab12sd 09082323ab 10
3 Hdfc nolanheroman nolanherom 10
4 Tata 97543007356 9754300 7
5 Tata pqrsequence2o202 pqrsequ 7
6 Tata 987654321 9876543 7
CodePudding user response:
Another solution:
df['Expected OutPut'] = df.apply(lambda x: x['Policy no.'][0:10] if x['Insurer']=='Hdfc' else x['Policy no.'][0:7], axis = 1)
print(df)
Insurer Policy no. Expected OutPut
0 Hdfc 4509242332 4509242332
1 Tata tatadigitNational tatadig
2 Hdfc 09082323ab12sd 09082323ab
3 Hdfc nolanheroman nolanherom
4 Tata 97543007356 9754300
5 Tata pqrsequence2o202 pqrsequ
6 Tata 987654321 9876543
CodePudding user response:
You could try this
df['Expected Output'] = np.where(df['Insurer']== 'Hdfc', df["Policy no"].str[:10],df["Policy no"].str[:7])
CodePudding user response:
A vector solution using pandas slicing
df['Expected Output'] = df['Policy no.'].str[:10]
df.loc[df.index[df.Insurer.eq('Tata')], 'Expected Output'] = df['Expected Output'].loc[df.index[df.Insurer.eq('Tata')]].str[:7]
which gives us the expected output :
df
Insurer Policy no. Expected Output
0 Hdfc 4509242332 4509242332
1 Tata tatadigitNational tatadig
2 Hdfc 09082323ab12sd 09082323ab
3 Hdfc nolanheroman nolanherom
4 Tata 97543007356 9754300
5 Tata pqrsequence2o202 pqrsequ
6 Tata 987654321 9876543
CodePudding user response:
You can easily do this by converting the dataframe to an array, then iterating through every row
array = df.to_numpy()
for row in array:
#assuming you only have 2 columns, check to see if the insurer is Tata
if row[0] == 'Tata':
#slice string in the Policy column
row[1] = row[1][:7]
#now, convert array back to df
pd.Dataframe(array, columns=['Insurer','Policy no.'])