How to get first n characters from another column that doesn't contain specific characters

I have this dataframe

ID	product name
1BJM10	1BJM10_RS2022_PK
	L_RS2022_PK
	2PKL10_RS2022_PK
	3BDG10_RS2022_PK
1BJM10	1BJM10_RS2022_PK

My desired output is like this

ID	product name
1BJM10	1BJM10_RS2022_PK
-	L_RS2022_PK
2PKL10	2PKL10_RS2022_PK
3BDG10	3BDG10_RS2022_PK
1BJM10	1BJM10_RS2022_PK

The 2nd row shouldn't get an ID because it has "_" within the first 6 characters of the product name.

I have tried this code, but it doesn't work

df.loc[df['ID'].isna()] = df['ID'].fillna(~df['product name'].str[:6].contains("_"))

CodePudding user response：

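Combine both conditions with & (element-wise AND), using a helper Series that holds the first 6 characters of each product name. Only rows where ID is missing and that prefix contains no "_" get filled: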

s = df['product name'].str[:6]
df.loc[df['ID'].isna() & ~s.str.contains("_"), 'ID'] = s
print (df)
       ID      product name
0  1BJM10  1BJM10_RS2022_PK
1     NaN       L_RS2022_PK
2  2PKL10  2PKL10_RS2022_PK
3  3BDG10  3BDG10_RS2022_PK
4  1BJM10  1BJM10_RS2022_PK
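
The output above leaves row 1 as NaN, while the desired output in the question shows "-". A minimal self-contained sketch of the same masking idea follows. It assumes the blank IDs are real NaN values (not empty strings), and the final fillna("-") step is an addition not in the answer above:

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank IDs are assumed to be NaN.
df = pd.DataFrame({
    "ID": ["1BJM10", np.nan, np.nan, np.nan, "1BJM10"],
    "product name": [
        "1BJM10_RS2022_PK",
        "L_RS2022_PK",
        "2PKL10_RS2022_PK",
        "3BDG10_RS2022_PK",
        "1BJM10_RS2022_PK",
    ],
})

# First 6 characters of every product name.
prefix = df["product name"].str[:6]

# Fill only rows whose ID is missing and whose prefix has no underscore.
# regex=False treats "_" as a literal string rather than a pattern.
mask = df["ID"].isna() & ~prefix.str.contains("_", regex=False)
df.loc[mask, "ID"] = prefix

# Assumption: remaining gaps should be shown as "-", as in the desired output.
df["ID"] = df["ID"].fillna("-")
print(df)
```

The assignment through df.loc aligns prefix on the index, so only the masked rows receive a value.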

CodePudding user response：

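Try the following. It takes everything before the first "_" when that underscore sits at position 6 or later, and an empty string otherwise. Note that it overwrites ID in every row, not only the missing ones: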

df['ID'] = df['product name'].apply(lambda x: x[:x.find('_')] if x.find('_')>=6 else '')
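
If the existing IDs should be kept, one possible variant is to compute the candidate IDs separately and pass them to fillna, which only touches missing entries. This is a sketch under the assumption that blank IDs are NaN; id_from_name is a hypothetical helper name introduced for illustration:

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank IDs are assumed to be NaN.
df = pd.DataFrame({
    "ID": ["1BJM10", np.nan, np.nan, np.nan, "1BJM10"],
    "product name": [
        "1BJM10_RS2022_PK",
        "L_RS2022_PK",
        "2PKL10_RS2022_PK",
        "3BDG10_RS2022_PK",
        "1BJM10_RS2022_PK",
    ],
})

def id_from_name(name):
    # Hypothetical helper: text before the first "_" when that underscore
    # is at position 6 or later; NaN otherwise (including no underscore).
    pos = name.find("_")
    return name[:pos] if pos >= 6 else np.nan

# fillna with a Series aligns on the index and leaves existing IDs untouched.
df["ID"] = df["ID"].fillna(df["product name"].map(id_from_name))
print(df)
```

Rows that do not qualify stay NaN here; a trailing .fillna("-") would turn them into "-" if that is preferred.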