Removing \xf characters

Time:10-30

I am trying to remove all byte sequences such as

\xf0\x9f\x93\xa2, \xf0\x9f\x95\x91, \xe2\x80\xa6, \xe2\x80\x99

from the strings below in Python:

"b'Hello! \xf0\x9f\x93\xa2 End Climate Silence is looking for volunteers! \n\n1-2 hours per week. \xf0\x9f\x95\x91\n\nExperience doing digital research\xe2\x80\xa6

"b'I doubt if climate emergency 8s real, I think people will look ba\xe2\x80\xa6 '

"b'No, thankfully it doesn\xe2\x80\x99t. Can\xe2\x80\x99t see how cheap to overtourism in the alan alps can h\xe2\x80\xa6"

"b'Climate Change Poses a WidelllThreat to National Security "

"b""This doesn't feel like targeted propaganda at all. I mean states\xe2\x80\xa6"

"b'berates climate change activist who confronted her in airport\xc2\xa0 

I have tried

string.encode('ascii', errors='ignore')

and regex, but without luck. Any suggestions would be helpful.

CodePudding user response:

Try decoding the bytes:

text=b'Hello! \xf0\x9f\x93\xa2 End Climate Silence is looking for volunteers! \n\n1-2 hours per week. \xf0\x9f\x95\x91\n\nExperience doing digital research\xe2\x80\xa6'.decode("utf8")
print(text) 
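>> Hello! 📢 End Climate Silence is looking for volunteers! 

1-2 hours per week. 🕑

Experience doing digital research…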
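
Decoding turns the byte sequences into the characters they stand for (emoji, an ellipsis, a curly apostrophe) rather than removing them. If the goal is to drop them entirely, one option is to decode as UTF-8 and then re-encode as ASCII while ignoring whatever does not fit. The sketch below is not part of the answer above: strip_non_ascii is a hypothetical helper name, and it assumes that any string starting with b' or b" is the complete text representation of a bytes object (for example, because str() was applied to bytes before the data was saved). If that assumption holds, it may also explain why string.encode('ascii', errors='ignore') appeared to do nothing: in such a string the backslash sequences are ordinary ASCII characters, so there is nothing to ignore.

```python
import ast


def strip_non_ascii(raw):
    """Return the text with every non-ASCII character removed.

    Hypothetical helper for illustration; accepts bytes, a normal str,
    or a str that holds the repr of a bytes object (e.g. "b'...'").
    """
    if isinstance(raw, str) and raw[:2] in ("b'", 'b"'):
        # Assumption: the string is a complete bytes literal, so it can
        # be parsed back into real bytes safely with literal_eval.
        raw = ast.literal_eval(raw)
    if isinstance(raw, bytes):
        # Turn the UTF-8 bytes into text; skip any invalid sequences.
        raw = raw.decode("utf-8", errors="ignore")
    # Drop every character that has no ASCII encoding.
    return raw.encode("ascii", errors="ignore").decode("ascii")


sample = b'No, thankfully it doesn\xe2\x80\x99t. Can\xe2\x80\x99t see'
print(strip_non_ascii(sample))  # No, thankfully it doesnt. Cant see
```

Note that this also removes the curly apostrophe, so "doesn’t" becomes "doesnt"; replacing such characters with ASCII equivalents before stripping may be preferable. A representation that is cut off before its closing quote makes ast.literal_eval raise SyntaxError and would need to be repaired first.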