How to get first n characters from another column that doesn't contain specific characters

Time:11-30

I have this dataframe

ID      product name
1BJM10  1BJM10_RS2022_PK
NaN     L_RS2022_PK
NaN     2PKL10_RS2022_PK
NaN     3BDG10_RS2022_PK
1BJM10  1BJM10_RS2022_PK

My desired output is like this

ID      product name
1BJM10  1BJM10_RS2022_PK
-       L_RS2022_PK
2PKL10  2PKL10_RS2022_PK
3BDG10  3BDG10_RS2022_PK
1BJM10  1BJM10_RS2022_PK

The 2nd row shouldn't get an ID because it has "_" within the first 6 characters of its product name.

I have tried this code, but it doesn't work:

df.loc[df['ID'].isna()] = df['ID'].fillna(~df['product name'].str[:6].contains("_"))

CodePudding user response:

Chain both conditions with & (element-wise AND), using a helper Series that holds the first 6 characters of the product name:

s = df['product name'].str[:6]
df.loc[df['ID'].isna() & ~s.str.contains("_"), 'ID'] = s
print(df)
       ID      product name
0  1BJM10  1BJM10_RS2022_PK
1     NaN       L_RS2022_PK
2  2PKL10  2PKL10_RS2022_PK
3  3BDG10  3BDG10_RS2022_PK
4  1BJM10  1BJM10_RS2022_PK
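
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question, and it assumes the empty IDs are NaN; the names prefix and mask are just illustrative. Passing regex=False makes str.contains treat "_" as a literal string.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; missing IDs are assumed to be NaN.
df = pd.DataFrame({
    "ID": ["1BJM10", np.nan, np.nan, np.nan, "1BJM10"],
    "product name": [
        "1BJM10_RS2022_PK",
        "L_RS2022_PK",
        "2PKL10_RS2022_PK",
        "3BDG10_RS2022_PK",
        "1BJM10_RS2022_PK",
    ],
})

# First six characters of each product name.
prefix = df["product name"].str[:6]

# Fill only rows where ID is missing AND the prefix has no underscore.
mask = df["ID"].isna() & ~prefix.str.contains("_", regex=False)
df.loc[mask, "ID"] = prefix

print(df)
```

The assignment aligns prefix to the masked rows by index, so rows that fail either condition keep whatever they had before.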

CodePudding user response:

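Try this. Note that it rebuilds the whole ID column from the text before the first underscore, and rows whose first underscore falls within the first 6 characters (or that have no underscore at all) get an empty string: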

df['ID'] = df['product name'].apply(lambda x: x[:x.find('_')] if x.find('_')>=6 else '')
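
If the existing IDs should be left alone and unmatched rows should stay missing, a fillna-based variant is one possible sketch. It is closer to what the question's attempt was going for. It assumes the empty IDs are NaN; candidate is an illustrative name, and the "-" placeholder at the end is only a guess at what the desired output means.

```python
import numpy as np
import pandas as pd

# Small sample in the shape of the question's data (missing IDs assumed NaN).
df = pd.DataFrame({
    "ID": ["1BJM10", np.nan, np.nan],
    "product name": ["1BJM10_RS2022_PK", "L_RS2022_PK", "2PKL10_RS2022_PK"],
})

prefix = df["product name"].str[:6]

# Keep a prefix only when it contains no underscore; other rows become NaN.
candidate = prefix.where(~prefix.str.contains("_", regex=False))

# fillna leaves existing IDs untouched and fills the gaps from candidate,
# matching rows by index.
df["ID"] = df["ID"].fillna(candidate)

# Optional (assumption): show "-" for rows that still have no ID.
with_dash = df["ID"].fillna("-")

print(df)
print(with_dash)
```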