How to find abbreviations in a sentence and just remove the full stop in the abbreviations?

Time:10-19

I am trying to find abbreviations in a sentence with Python. For example, u.s.a. is the same as usa, so I want to find u.s.a., remove the full stops inside it, and get usa as the result: 'I come from u.s.a..' should become 'I come from usa.' How can I do this? So far I can find all the abbreviations with the regex pattern = re.compile(r'(?:[a-z]\.){2,}'), but I cannot work out how to remove only their full stops.

CodePudding user response:

I see (at least) 2 options:

  1. Make a list of abbreviations, then find and replace them.

  2. If you can find the abbreviations with a regex, call replace(".", "") on each match, which replaces "." with "" (nothing).
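The first option could look something like the sketch below. The list contents and the names KNOWN_ABBREVIATIONS and strip_known_abbreviations are made up for illustration, so replace them with whatever fits your data.

```python
import re

# Hypothetical list of known abbreviations; extend it for your own data.
KNOWN_ABBREVIATIONS = ["u.s.a.", "u.k.", "e.g."]

# Try longer abbreviations first so a shorter one cannot match a prefix of them.
_alternatives = "|".join(
    re.escape(abbr) for abbr in sorted(KNOWN_ABBREVIATIONS, key=len, reverse=True)
)
ABBREVIATION_RE = re.compile(r"\b(?:" + _alternatives + r")", re.IGNORECASE)


def strip_known_abbreviations(text):
    """Remove the dots from every listed abbreviation found in text."""
    return ABBREVIATION_RE.sub(lambda m: m.group().replace(".", ""), text)


print(strip_known_abbreviations("I come from u.s.a.."))
```

Anything not in the list is left untouched, which avoids false positives but means the list has to be maintained by hand.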

CodePudding user response:

You can use

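import re
text = 'some A.B.B.R.E.V. here.'
# Two or more "letter + dot" pairs starting at a word boundary; re.I ignores case
pattern = re.compile(r'\b(?:[a-z]\.){2,}', re.I)
# For each match, strip the dots and substitute the result back into the text
text = pattern.sub(lambda m: m.group().replace('.',''), text)
print(text)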

Output:

some ABBREV here.

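The callable passed as the replacement argument to pattern.sub receives each match object (assigned to m). m.group().replace('.','') removes every . character from the matched text, and the result is substituted for the original match. The \b word boundary keeps a match from starting in the middle of a word, and re.I lets [a-z] match uppercase letters as well.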
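Applied to the sentence from the question, the same approach can be wrapped in a small helper. This is only a sketch, and the names ABBREVIATION_RE and remove_abbreviation_dots are invented here for illustration.

```python
import re

# Same pattern as in the answer: two or more "letter + dot" pairs at a word boundary.
ABBREVIATION_RE = re.compile(r"\b(?:[a-z]\.){2,}", re.IGNORECASE)


def remove_abbreviation_dots(text):
    """Strip the dots from every abbreviation-like match in text."""
    return ABBREVIATION_RE.sub(lambda m: m.group().replace(".", ""), text)


print(remove_abbreviation_dots("I come from u.s.a.."))  # I come from usa.
```

One thing to watch for: the last dot of the abbreviation is removed too, so a sentence written with a single trailing dot ('I come from u.s.a.') comes back as 'I come from usa' with no full stop at the end.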
