I have a dataframe:
Name SubName
AB ABCD
UI 10UI09
JK 89-JK-07
yhk 100yhk0A
I need to add a column containing the characters of SubName that do not appear in Name:
Name SubName Remainder
AB ABCD CD
UI 10UI09 1009
JK 89-JK-07 89--07
yhk 100yhk0A 1000A
CodePudding user response:
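You need a row-wise loop here. One option is a regex character class built from the characters of Name (re.escape keeps characters such as - or ] literal, and an empty Name leaves SubName unchanged):
import re
df['Remainder'] = [re.sub(f'[{re.escape("".join(set(a)))}]', '', b) if a else b
                   for a, b in zip(df['Name'], df['SubName'])]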
Alternative with join and set (could be faster in some cases):
df['Remainder'] = [''.join([c for c in b if c not in S])
                   if (S := set(a)) else b
                   for a, b in zip(df['Name'], df['SubName'])]
output:
Name SubName Remainder
0 AB ABCD CD
1 UI 10UI09 1009
2 JK 89-JK-07 89--07
3 yhk 100yhk0A 1000A
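As a further sketch of the same character-removal idea (not part of the answer above), str.translate with a deletion table avoids both the regex escaping and the explicit set. The helper name remainder and the plain lists are only illustrative stand-ins for the two dataframe columns:

```python
def remainder(name: str, sub_name: str) -> str:
    """Return sub_name with every character that occurs in name removed."""
    # The third argument of str.maketrans lists characters to delete.
    return sub_name.translate(str.maketrans("", "", name))

# Plain lists standing in for df['Name'] and df['SubName'] (sample data from the question).
names = ["AB", "UI", "JK", "yhk"]
sub_names = ["ABCD", "10UI09", "89-JK-07", "100yhk0A"]

remainders = [remainder(a, b) for a, b in zip(names, sub_names)]
print(remainders)
```

With a dataframe, the same list comprehension over zip(df['Name'], df['SubName']) could be assigned to df['Remainder'].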
CodePudding user response:
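You can also use apply to build the new column. Note that str.replace removes Name as a whole substring, so this only matches the expected output when Name appears contiguously in SubName:
df["Remainder"] = df.apply(lambda x: x["SubName"].replace(x["Name"], ""), axis=1)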
Output:
Name SubName Remainder
AB ABCD CD
UI 10UI09 1009
JK 89-JK-07 89--07
yhk 100yhk0A 1000A
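A short sketch of where substring replacement and per-character removal diverge. The input "A1B2" is a hypothetical row, not taken from the question's data; in the question's four rows Name always occurs as one contiguous piece, so both approaches agree there:

```python
name, sub_name = "AB", "A1B2"  # hypothetical row: the characters of Name are not adjacent

# str.replace looks for the whole substring "AB", which does not occur here.
by_substring = sub_name.replace(name, "")

# Character-wise removal drops every "A" and every "B" individually.
name_chars = set(name)
by_character = "".join(c for c in sub_name if c not in name_chars)

print(by_substring)
print(by_character)
```

Which behaviour is wanted depends on whether Name is guaranteed to appear intact inside SubName.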