Home > Enterprise >  How do I "not" select first character in a regex pattern?
How do I "not" select first character in a regex pattern?

Time:12-25

I am a RegEx beginner and trying to identify the endings of different statements in sms. See screenshot below.

How can I avoid selecting the next letter following by a full-stop that indicates ending of a statement.

Note that some statements have <.><Alphabets> while some have <.><space><Alphabets>

Regex used: r"\. ?[\D]"

Sample SMS: - I want to select just the full-stop and space if any.

Txn of USD 00.00 done using TC XX at POS*MERCH on 30-Feb-22. Avl bal:USD 00.00. Call xxxxxx for dispute or SMS BLOCK xxxx to xxxxxxx

Acct XX debited with USD XX.00 on some date.Info: ABC*BDECS-XYZ.Avbl Bal:USD yy,xxx.95.Call xxxxxx for dispute or SMS BLOCK xx to xxxxx

screenshot from RegExr on regular pattern

CodePudding user response:

What you're looking for is a look-ahead group. Whether you make that a positive look-ahead and use the negated character set \D or a negative look-ahead with the character set \d doesn't really matter- I'll outline both below:

regex = r". ?(?=\D)"  # asserts that the following character matches \D
regex = r". ?(?!\d)"  # asserts the following character does NOT match \d

There's also look-behind variants (?<!pattern) and (?<=pattern), which assert that the pattern doesn't/does match just before the current position.

None of these groups capture the matched text- they just "look ahead" or "look behind" without changing state.

  • Related