Removing strings based on other strings in pandas

Time:11-03

| Column A | Column B |
| --- | --- |
| JELLY ROSEN 15 JELLY STREET APT 111 JERSEY CITY, NJ | 15 JELLY STREET APT 111 JERSEY CITY, NJ |
| ROB ROSEN & STEVEN ROSEN 29 MAIN ST JERSEY CITY, NJ | 29 MAIN ST JERSEY CITY, NJ |

I need the following:

| Column C |
| --- |
| JELLY ROSEN |
| ROB ROSEN & STEVEN ROSEN |

I've tried something like this:

def address_name(COLUMN A, COLUMN B):
    return df['COLUMN A'][:df['COLUMN A'][0].find(df['COLUMN B'][0])]

CodePudding user response:

You can try the function below to populate Column C:

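def replace_values(x):
    # x is one row holding the 'Column A' and 'Column B' values
    cola, colb = x
    # remove the address from the full string, then trim the leftover space
    return cola.replace(colb, '').strip()

# apply the function row by row
df['Column C'] = df[['Column A', 'Column B']].apply(replace_values, axis=1)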
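For reference, here is a self-contained sketch of the same row-wise idea, built on the sample rows from the question. The helper name `strip_address` is hypothetical, and the sketch assumes the address in Column B, when present, sits at the end of Column A; rows where that does not hold are returned unchanged apart from trimming.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column A": [
        "JELLY ROSEN 15 JELLY STREET APT 111 JERSEY CITY, NJ",
        "ROB ROSEN & STEVEN ROSEN 29 MAIN ST JERSEY CITY, NJ",
    ],
    "Column B": [
        "15 JELLY STREET APT 111 JERSEY CITY, NJ",
        "29 MAIN ST JERSEY CITY, NJ",
    ],
})


def strip_address(full, address):
    """Hypothetical helper: drop `address` from the end of `full`."""
    # Only cut when the address really is a suffix of the full string.
    if address and full.endswith(address):
        return full[:-len(address)].strip()
    # Otherwise leave the text as it is (assumption: nothing to remove).
    return full.strip()


# Pair the two columns row by row without DataFrame.apply.
df["Column C"] = [
    strip_address(a, b) for a, b in zip(df["Column A"], df["Column B"])
]

print(df["Column C"].tolist())
# ['JELLY ROSEN', 'ROB ROSEN & STEVEN ROSEN']
```

Checking for a suffix instead of calling `replace` avoids accidentally deleting the address text if it also happens to appear earlier in the string.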

CodePudding user response:

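You can use a regex, assuming the name contains no digits and the address starts with one:

import re
# capture everything before the first digit, then trim the trailing space
df["COLUMN C"] = [re.match(r"([^0-9]+)", x).group(1).rstrip() for x in df["COLUMN A"]]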
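The same regex idea can also be sketched with the vectorized pandas string accessor. This is only an illustration under the same assumption (the name has no digits and the address begins with one); the third sample row is made up to show what happens when nothing matches.

```python
import pandas as pd

df = pd.DataFrame({
    "COLUMN A": [
        "JELLY ROSEN 15 JELLY STREET APT 111 JERSEY CITY, NJ",
        "ROB ROSEN & STEVEN ROSEN 29 MAIN ST JERSEY CITY, NJ",
        "29 MAIN ST JERSEY CITY, NJ",  # made-up row with no leading name
    ],
})

# Capture the leading run of non-digit characters, then trim whitespace.
# expand=False returns a Series because the pattern has a single group.
df["COLUMN C"] = (
    df["COLUMN A"].str.extract(r"^(\D+)", expand=False).str.strip()
)

print(df["COLUMN C"].tolist())
```

Rows that do not start with a non-digit character come back as missing values here, whereas calling a method on the result of `re.match` would raise an error for such rows.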