I'm writing a Python script to transfer Excel Online data to GCP, and I would like to remove \xa0 from the strings in a DataFrame column, such as '\xa0shopName', '\xa0Street Adress', and '\xa0'.
I've tried df = df.replace(u'\xa0', u''), but it only replaces cells that are exactly '\xa0'; strings that contain \xa0 together with other words stay the same. Maybe a regex such as df = re.sub('#regular expression', '', df) would help, but I cannot find the right regular expression :/
CodePudding user response:
You can just use .strip() to remove that character if it appears at the beginning or end of your strings:
>>> a='\xa0Street Adress'
>>> a[0]
'\xa0'
>>> a.strip()
'Street Adress'
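One caveat worth illustrating: .strip() only trims the two ends of a string, so a \xa0 in the middle survives. The sketch below uses made-up sample values modeled on the question (the last one, "Main\xa0Street", is purely hypothetical) to compare it with str.replace, which removes the character wherever it occurs.

```python
# Hypothetical sample values, modeled on the strings in the question.
samples = ["\xa0shopName", "\xa0Street Adress", "\xa0", "Main\xa0Street"]

# str.strip() treats U+00A0 as whitespace, but it only trims the two ends.
stripped = [s.strip() for s in samples]

# str.replace() removes the character wherever it occurs in the string.
replaced = [s.replace("\xa0", "") for s in samples]

print(stripped)
print(replaced)
```

If the character can only ever appear at the start or end, .strip() is enough; otherwise replace is the safer choice.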
CodePudding user response:
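I believe you're running into an issue with how something is presented versus how it's represented. Hex a0 is decimal 160, the non-breaking space, which Python shows in a string's repr as \xa0. Do you have the literal four-character text \xa0 in your data, or is the presentation just showing you \xa0 for that single character? If it's the former, you need to escape the backslash in the pattern (here I also use a raw string), and pass regex=True so that substrings are replaced rather than only whole cells:
df.replace(r"\\xa0", "", regex=True)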
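If the latter, your pattern is right, but df.replace compares whole cell values by default, which is why only the cells equal to '\xa0' changed. Pass regex=True to replace substrings as well:
df.replace("\xa0", "", regex=True)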
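To make the two cases concrete, here is a minimal sketch, assuming pandas is installed. The DataFrame and its column names ("real", "literal") are invented for illustration: one column holds the actual non-breaking space, the other the four-character literal text. Checking the length of a cell is one simple way to tell which case you have.

```python
import pandas as pd

# Hypothetical data: "real" holds the actual U+00A0 character,
# "literal" holds the four characters backslash, x, a, 0.
df = pd.DataFrame({
    "real": ["\xa0shopName", "\xa0Street Adress", "\xa0"],
    "literal": [r"\xa0shopName", r"\xa0Street Adress", r"\xa0"],
})

# Telling the two cases apart: one character versus four.
print(len(df.loc[2, "real"]), len(df.loc[2, "literal"]))

# Default replace compares whole cell values, so only the last row changes.
whole_cell = df["real"].replace("\xa0", "")

# regex=True switches to substring matching.
cleaned_real = df["real"].replace("\xa0", "", regex=True)

# For the literal text, the backslash itself must be escaped in the pattern.
cleaned_literal = df["literal"].replace(r"\\xa0", "", regex=True)

print(whole_cell.tolist())
print(cleaned_real.tolist())
print(cleaned_literal.tolist())
```

The same calls work on the whole DataFrame (df.replace(..., regex=True)) if every column should be cleaned.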