How to match data between second occurrence of first marker and second marker with REGEX?

Time:10-06

Data:

<CPSs item="1">
      <CKS item="1">
        <Name item="1">join.jciie.tab</Name>
        <Address item="1">
          <DNSname item="1">join.jciie.tab</DNSname>
          <Address item="1">30.49.54.147</Address>
          <Priority item="1">65036</Priority>
          <Status item="1">Active</Status>
          <Port item="1">403</Port>
        </Address>
      </CKS>
</CPSs>

I want to get the data between <Address item="1"> and </Address>. The result I want is 30.49.54.147, not <DNSname item="1">join.jciie.tab</DNSname> <Address item="1">30.49.54.147.

I tried the code below, but it didn't work.

    import re
    address_pattern = r'<Address item="1">(.*?)</Address>'
    cks_config = re.search(address_pattern, data).group(1)
    print(cks_config)

How can I accomplish this?

CodePudding user response:

<DNSname item="1">join.jciie.tab</DNSname> <Address item="1">30.49.54.147

Your code apparently matched from the first (outer) Address opening tag up to the first Address closing tag, which belongs to the inner element. If you are sure the inner element will always hold an IP address in dotted-decimal form, you might try using

    address_pattern = r'<Address item="1">([\.\d]+)</Address>'

Explanation: this looks for an Address tag whose content consists solely of digits (\d) and dots (\.); the + requires at least one such character.
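To illustrate the difference, here is a minimal sketch comparing the original pattern with the restricted one. It assumes the data arrives as a single line with whitespace between the tags, which is what the output in the question suggests; the variable names `loose_pattern`, `strict_pattern`, `loose_result` and `address` are made up for this example.

```python
import re

# Sample data from the question, flattened to one line.
# Assumption: the real input has no line breaks, as the output
# shown in the question suggests.
data = (
    '<CPSs item="1"> <CKS item="1"> <Name item="1">join.jciie.tab</Name> '
    '<Address item="1"> <DNSname item="1">join.jciie.tab</DNSname> '
    '<Address item="1">30.49.54.147</Address> '
    '<Priority item="1">65036</Priority> <Status item="1">Active</Status> '
    '<Port item="1">403</Port> </Address> </CKS> </CPSs>'
)

# Original pattern: the match starts at the outer <Address> tag and
# runs to the first </Address>, so it also captures the DNSname element.
loose_pattern = r'<Address item="1">(.*?)</Address>'
loose_result = re.search(loose_pattern, data).group(1)

# Restricted pattern: only digits and dots may appear between the tags,
# so the outer <Address> (followed by whitespace and a tag) cannot match.
strict_pattern = r'<Address item="1">([\.\d]+)</Address>'
match = re.search(strict_pattern, data)
address = match.group(1) if match else None  # None if no IP-like content

print(loose_result)
print(address)
```

Checking `match` before calling `group(1)` avoids an `AttributeError` when no Address element contains only digits and dots.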
