I have a data frame (location) as shown below. I have also pasted my current code, which filters out all records containing numbers and special characters.
My issue arises when there is a space character between words, e.g. NEWYORK CITY or NEW YORK CITY. I do not want to filter out rows just because of a space character between words.
INPUT
location.head(8)
CITY COUNTRY
AGNIN34 FR
(REYDON) GB
MARSCIANO IT
SANXIANG TOWN CN
SIZIANO IT
APELDOORN NL
REYDON GB
NEWYORK CITY US
My current code:
out = location[location.apply(lambda c: c.str.match('(?i)[a-z]+$')).all(1)]
Expected Output
CITY COUNTRY
MARSCIANO IT
SANXIANG TOWN CN
SIZIANO IT
APELDOORN NL
REYDON GB
NEWYORK CITY US
How can this be done?
CodePudding user response:
Check the CITY column against a pattern that allows only letters and spaces:
out = location[location.CITY.astype(str).str.match('^[a-zA-Z ]*$')]
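To try this against the data from the question, here is a minimal sketch. The data frame is rebuilt by hand from the location.head(8) output above, so the construction step is an assumption about how the real frame looks; the filter line itself is the one from this answer.

```python
import pandas as pd

# Sample data rebuilt by hand from the question's location.head(8) output.
location = pd.DataFrame({
    "CITY": ["AGNIN34", "(REYDON)", "MARSCIANO", "SANXIANG TOWN",
             "SIZIANO", "APELDOORN", "REYDON", "NEWYORK CITY"],
    "COUNTRY": ["FR", "GB", "IT", "CN", "IT", "NL", "GB", "US"],
})

# Keep rows whose CITY consists only of ASCII letters and spaces.
# astype(str) turns missing values into the string "nan" before matching.
mask = location["CITY"].astype(str).str.match(r"^[a-zA-Z ]*$")
out = location[mask]

print(out)
```

Note that `*` also accepts an empty string; swapping it for `+` would require at least one letter or space. Because of astype(str), a missing CITY becomes the string "nan", which consists of letters only and would therefore pass this filter.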
CodePudding user response:
Use str.contains along with the na=False flag:
out = location[location["CITY"].str.contains(r'^[A-Za-z ]+$', na=False, regex=True)]
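As a rough illustration of what na=False buys you, the sketch below uses a small made-up frame (not the asker's data) with one missing CITY value. With na=False the missing entry is treated as a non-match, so the mask stays purely boolean and can be used for indexing directly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing CITY, for illustration only.
location = pd.DataFrame({
    "CITY": ["AGNIN34", "NEW YORK CITY", np.nan, "REYDON"],
    "COUNTRY": ["FR", "US", "GB", "GB"],
})

# One or more letters/spaces spanning the whole string.
pattern = r"^[A-Za-z ]+$"

# na=False makes missing values count as "no match" instead of NaN.
mask = location["CITY"].str.contains(pattern, na=False, regex=True)
out = location[mask]

print(out)
```

Since the pattern is anchored with ^ and $, str.contains here behaves like a whole-string check rather than a substring search.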