How can I extract a substring from a string, avoiding including the delimiters?

Time:05-19

I'm having trouble extracting a substring without including the delimiters.

x =  "- dubuque3528 [21/Jun/2019:15:46:"

or

x = "- - [21/Jun/2019:15:46:"

user_name = re.findall('.-.*\[', x)

That returns: "- dubuque3528 [" or "- - [". I would like to retrieve "dubuque3528" or "-" instead.

CodePudding user response:

With the samples you have shown, please try the following regex.

-\s+(\S+)\s+\[

Here is the online demo for the above regex.

You can run this regex in Python as follows (written and tested in Python 3):

import re
x = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]
for val in x:
  m = re.search(r'-\s+(\S+)\s+\[', val)
  if m:
    print(m.group(1))

Output will be as follows:

dubuque3528
-

Explanation of above regex:

-\s+    ##Matches a hyphen followed by 1 or more whitespace characters.
(\S+)   ##1st capturing group: matches 1 or more non-whitespace characters.
\s+\[   ##Matches 1 or more whitespace characters followed by a literal [.
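
Since the question used re.findall, it may help to note that when a pattern contains exactly one capturing group, re.findall returns only the text of that group rather than the whole match. A minimal sketch, assuming the same two sample strings (the names pattern, samples and user_names are only illustrative):

```python
import re

# Same regex as above, compiled once so it can be reused.
pattern = re.compile(r'-\s+(\S+)\s+\[')

samples = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]

# With a single capturing group, findall yields just the group text,
# so the leading "- " and trailing " [" delimiters are dropped.
user_names = [pattern.findall(s) for s in samples]
print(user_names)
```

Each inner list holds the captured values for one input string, so a string that does not match yields an empty list instead of raising an error.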

CodePudding user response:

You can use

-\s*(.*?)\s*\[

See the regex demo. Details:

  • - - a hyphen
  • \s* - zero or more whitespaces
  • (.*?) - Group 1: any zero or more chars other than line break chars as few as possible
  • \s* - zero or more whitespaces
  • \[ - a [ char.

See the Python demo:

import re
x = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]
for s in x:
    m = re.search(r'-\s*(.*?)\s*\[', s)
    if m:
        print(m.group(1))
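
The two patterns in this thread behave the same on the samples from the question, but they can differ on other input. A small sketch of that difference, assuming a hypothetical log line whose user field contains a space (this input is not from the question and is used only for illustration):

```python
import re

strict = re.compile(r'-\s+(\S+)\s+\[')    # group must be one run of non-whitespace
lenient = re.compile(r'-\s*(.*?)\s*\[')   # group may contain spaces, as short as possible

# Hypothetical input: a user field with an embedded space.
line = "- john smith [21/Jun/2019:15:46:"

m_strict = strict.search(line)
m_lenient = lenient.search(line)

# The strict pattern cannot span the space inside the name, so it finds no match.
print(m_strict)
# The lenient pattern captures everything between the hyphen and the bracket.
print(m_lenient.group(1))
```

Which one fits better depends on whether the field can ever contain whitespace: the \S+ version rejects such lines outright, while the .*? version accepts them.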