Need help extracting initials from text

Time:09-05

I want to extract the initials that appear at the end of a text. They are usually preceded by the symbol '^'. In most cases the initials occur at the end of the body of text, though sometimes not at the very end, but in all cases the initials or name are in all uppercase. Examples:

  1. Good evening, The following areas will be affected by scheduled power interruptions to carry out network maintenance tomorrow 01/06/2022. ^LS
  2. You're welcome. Answered by: DO

CodePudding user response:

You can use the following:

\s*\^?([A-Z]+)$

Explanation:

  • \s* : zero or more white spaces
  • \^? : the character ^ zero or one time (it is optional)
  • ([A-Z]+) : the initials (one or more capital letters), saved in capture group #1
  • $ : end of line

Here is a working example
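
As a rough sketch of how this pattern could be applied in Python: it assumes the character class is followed by the + quantifier and that each message is handled as a single string. The helper name trailing_initials is hypothetical, and the third sample (taken from another answer, without its initials) is only there to show the no-match case.

```python
import re

# Pattern from the answer: optional whitespace, optional '^',
# then one or more capital letters anchored at the end of the string.
INITIALS_AT_END = re.compile(r"\s*\^?([A-Z]+)$")


def trailing_initials(message):
    """Return the uppercase initials that end the message, or None."""
    # rstrip() is an added assumption: it ignores trailing spaces/newlines.
    match = INITIALS_AT_END.search(message.rstrip())
    return match.group(1) if match else None


samples = [
    "Good evening, The following areas will be affected by scheduled "
    "power interruptions to carry out network maintenance tomorrow "
    "01/06/2022. ^LS",
    "You're welcome. Answered by: DO",
    "Good morning, the power service is restored.",
]

for sample in samples:
    print(trailing_initials(sample))
```

Because of the $ anchor, this only finds initials that close the message; initials followed by more text would need a different pattern.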

CodePudding user response:

"Usually preceded by ^" and "usually appearing at the end" are not useable pieces of information when composing regex. Regex either matches or doesn't.

Just match two or more consecutive capital letters:

[A-Z][A-Z]+

See live demo.
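
A minimal Python sketch of this approach, assuming the pattern is meant to match runs of two or more capitals. Taking the last run as the initials is an extra assumption, not part of the answer, and the helper name last_uppercase_run and the third sample string are made up for illustration.

```python
import re

# Two or more consecutive capital letters, anywhere in the text.
UPPERCASE_RUN = re.compile(r"[A-Z][A-Z]+")


def last_uppercase_run(message):
    """Return the last run of 2+ capital letters, or None if there is none."""
    runs = UPPERCASE_RUN.findall(message)
    # Assumption: when several runs exist, the initials are the last one.
    return runs[-1] if runs else None


print(UPPERCASE_RUN.findall("You're welcome. Answered by: DO"))
print(last_uppercase_run("maintenance tomorrow 01/06/2022. ^LS"))
# Hypothetical message containing another all-caps word before the initials.
print(UPPERCASE_RUN.findall("NOTE: power is back. ^LP"))
```

The trade-off is that any all-caps word in the message also matches, so some rule such as "take the last match" is needed to pick out the initials.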

CodePudding user response:

re.findall can be used:

import re

text = '''
Good evening, The following areas will be affected by 
scheduled power interruptions to carry out network 
maintenance tomorrow 01/06/2022. ^LS

Good morning, the power service is restored.
 ^LP
'''

re.findall(r'\s+(\^?[A-Z]+)\b', text)

['^LS', '^LP']