I am trying to find abbreviations in a sentence with Python. For example, u.s.a. is the same as usa, so I want to find u.s.a., remove the full stops inside the abbreviation, and get usa as the result.
'I come from u.s.a..'
should become 'I come from usa.'
How can I do this?
So far I can only find the abbreviations with a regex:
pattern = re.compile(r'(?:[a-z]\.){2,}')
but I cannot work out how to remove the full stops from them.
CodePudding user response:
I see (at least) two options:
Make a list of abbreviations, then find and replace them.
If you are able to find the abbreviation with a regex, you can then call replace(".","") on the match, which replaces every "." with "" (nothing).
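The first option could look roughly like the sketch below. The list contents and the function name strip_known_abbreviations are made up for illustration; you would fill the list with the abbreviations that actually occur in your data.

```python
# Hypothetical list of known abbreviations; extend it for your own data.
KNOWN_ABBREVIATIONS = ["u.s.a.", "u.k.", "e.u."]

def strip_known_abbreviations(text):
    """Replace each listed abbreviation with its dot-free form."""
    # Longest first, so a longer abbreviation is handled before a shorter one.
    for abbr in sorted(KNOWN_ABBREVIATIONS, key=len, reverse=True):
        text = text.replace(abbr, abbr.replace(".", ""))
    return text

print(strip_known_abbreviations("I come from u.s.a.."))  # I come from usa.
```

Note that str.replace is case-sensitive and matches anywhere in the string, so this only works well when the list is short and the spelling of each abbreviation is predictable.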
CodePudding user response:
You can use
import re
text = 'some A.B.B.R.E.V. here.'
pattern = re.compile(r'\b(?:[a-z]\.){2,}', re.I)
text = pattern.sub(lambda m: m.group().replace('.',''), text)
Output:
some ABBREV here.
The callable used as the replacement argument in re.sub operates on each match found (assigned to m), and m.group().replace('.','') removes all . chars from the match; the changed match is then used to replace the found match.
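To make the callback mechanism more visible, here is the same idea written with a named function instead of a lambda, applied to the sentence from the question. The names ABBREVIATION and drop_dots are only illustrative.

```python
import re

# Two or more "letter + dot" pairs in a row, starting at a word boundary.
ABBREVIATION = re.compile(r'\b(?:[a-z]\.){2,}', re.I)

def drop_dots(match):
    # match.group() is the matched text, e.g. 'u.s.a.'
    return match.group().replace('.', '')

result = ABBREVIATION.sub(drop_dots, 'I come from u.s.a..')
print(result)  # I come from usa.
```

The sentence-ending full stop survives because it is not preceded by a letter, so it is not part of the match. Keep in mind that the pattern also matches things like e.g. or initials such as J.K., which may or may not be what you want.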