I have a dataframe as below:
Car Code DOMAINS_616
Proton CFC_HIC_U1 PRA_U1|PRD_U1|CFC_HIC_U1|CFC_HPC_U1|CFC_HIC_U1
Perudua CFC_HVC_U1 PRA_U1|PRD_U1|CFC_HVC_U1|CFC_HPC_U1|CFC_HIC_U1
Toyota CFC_HIC_U1 PRA_D1|PRD_D1|CFC_HKC_U1|CFC_HPC_U1|CFC_HOC_U1
Proton CFC_HPC_U1 PRE_U1|PRD_U1|CFC_HPC_U1|CFC_HPC_U1|CFC_HOC_U1
Based on the code column I wish to know the position number at DOMAINS_616 which will seperate by | , expected output as below:
Car Code DOMAINS_616 Position
Proton CFC_HIC_U1 PRA_U1|PRD_U1|CFC_HIC_U1|CFC_HPC_U1|CFC_HIC_U2 3
Perudua CFC_HVC_U1 PRA_U1|PRD_U1|CFC_HVC_U1|CFC_HPC_U1|CFC_HIC_U1 3
Toyota CFC_HIC_U1 PRA_D1|PRD_D1|CFC_HKC_U1|CFC_HPC_U1|CFC_HIC_U1 5
Proton CFC_HPC_U1 PRE_U1|PRD_U1|CFC_HLC_U1|CFC_HPC_U1|CFC_HOC_U1 4
Anyone have ideas?
CodePudding user response:
if Code
appears once and only once in DOMAINS_616
,
first we split DOMAINS_616
into a list and explode it
step1 = df.assign(dummy=df['DOMAINS_616'].apply(lambda x: list(enumerate(x.split('|'), 1)))).explode('dummy')
then we expand the dummy
column into the Position
and domain
column
step1[['Position', 'domain']] = step1['dummy'].apply(pd.Series)
finally keep rows where Code
matches with domain
step1[step1['Code']==step1['domain']].drop(columns=['dummy', 'domain'])
CodePudding user response:
df['Position'] = df.apply(lambda row: row['DOMAINS_616'].split(|).index(row['Code']) 1, axis = 1)