Home > OS >  Pandas: Based on values in column A populate values in column B accordingly without iteration.(for l
Pandas: Based on values in column A populate values in column B accordingly without iteration.(for l

Time:07-27

Here is my problem statement -

I have columnA with data set like

import pandas as pd
df = pd.DataFrame({
    'columnA': ['DD22HAHTL1NXX---', 'DD22HATNT1N--D3F', 'DD22HATNT1N--B3F', 'DD22HAHTL1N--A3F', 'DD22HATNT1N--C1F', 'DD22HAHTL1N--A3F', 'DD22HATNT1N--B3F', 'DD22HAHTL1N--A3F', 'DD22HAHTL1N--A3E', 'DD22HAHTL1N--A3F', 'DD22HAHTL1N--B3F', 'DD22HAHTL1N--A3F', 'DD22HAHTL1N--A3F', 'DD22HAHTL1NZZ---', 'DD22HAHTL1N--A3E']})

I am trying to create a new columnB with the substring of columnA but with a catch. The condition is,

If the last 3 characters of each row is --- then I need to extract XX , i.e 12th and 13th characters and add it in new columnB else I need to capture the last 3 characters and add them to columnB . My desired output will look like this -

output

How can i create this new columnB efficiently without using any loops?

I tried this and worked for me for only 1 case:

df['columnB'] = df.columnA.str[-3:]
df[columnB] = df.['columnA'].str.extract('([A-Za,z]{2}---)', expand=True) 

This gave me last 3 charecters of all the columns which are not ending with '---' but with the column ending with '---' , I got it as is which is wrong. I also tried str.extract with regex but it gave me correct values for all the rows ending with '---' but not the others. Please help.

CodePudding user response:

You could use np.where

import numpy as np

df['columnB'] = np.where(df['columnA'].str.endswith('---'), df['columnA'].str.slice(11,13), df['columnA'].str.slice(-3))
print(df)

Result

             columnA columnB
0   DD22HAHTL1NXX---      XX
1   DD22HATNT1N--D3F     D3F
2   DD22HATNT1N--B3F     B3F
3   DD22HAHTL1N--A3F     A3F
4   DD22HATNT1N--C1F     C1F
5   DD22HAHTL1N--A3F     A3F
6   DD22HATNT1N--B3F     B3F
7   DD22HAHTL1N--A3F     A3F
8   DD22HAHTL1N--A3E     A3E
9   DD22HAHTL1N--A3F     A3F
10  DD22HAHTL1N--B3F     B3F
11  DD22HAHTL1N--A3F     A3F
12  DD22HAHTL1N--A3F     A3F
13  DD22HAHTL1NZZ---      ZZ
14  DD22HAHTL1N--A3E     A3E
  • Related