Data:
<CPSs item="1">
<CKS item="1">
<Name item="1">join.jciie.tab</Name>
<Address item="1">
<DNSname item="1">join.jciie.tab</DNSname>
<Address item="1">30.49.54.147</Address>
<Priority item="1">65036</Priority>
<Status item="1">Active</Status>
<Port item="1">403</Port>
</Address>
</CKS>
</CPSs>
I want to get the data between <Address item="1"> and </Address>. The result I want is 30.49.54.147, not <DNSname item="1">join.jciie.tab</DNSname> <Address item="1">30.49.54.147.
I tried the code below, but it didn't work.
import re
address_pattern = r'<Address item="1">(.*?)</Address>'
cks_config = re.search(address_pattern, data).group(1)
print(cks_config)
How can I accomplish this?
CodePudding user response:
<DNSname item="1">join.jciie.tab</DNSname> <Address item="1">30.49.54.147
Your code apparently caught the first (outer) Address opening tag and the first Address closing tag, which belongs to the inner element. If you are sure the element will always hold an IP address in dotted-decimal form, you might try using
address_pattern = r'<Address item="1">([\.\d]+)</Address>'
Explanation: look for an Address tag whose content consists solely of digits (\d) and dots (\.).
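A minimal sketch comparing the two patterns, assuming the XML arrives as a single line with spaces between the tags (the question does not show how data is defined, so the data string below is a reconstruction from the sample, and the names lazy_pattern and strict_pattern are only illustrative):

```python
import re

# Sample XML from the question, collapsed onto one line (an assumption;
# the question does not show how `data` is actually loaded).
data = (
    '<CPSs item="1"> <CKS item="1"> <Name item="1">join.jciie.tab</Name> '
    '<Address item="1"> <DNSname item="1">join.jciie.tab</DNSname> '
    '<Address item="1">30.49.54.147</Address> '
    '<Priority item="1">65036</Priority> <Status item="1">Active</Status> '
    '<Port item="1">403</Port> </Address> </CKS> </CPSs>'
)

# Original pattern: starts at the outer opening tag and stops at the
# first closing tag, so it also captures the DNSname element.
lazy_pattern = r'<Address item="1">(.*?)</Address>'

# Stricter pattern: the tag content may contain only digits and dots.
strict_pattern = r'<Address item="1">([.\d]+)</Address>'

lazy_result = re.search(lazy_pattern, data).group(1)

# re.search returns None when nothing matches, so guard before .group().
match = re.search(strict_pattern, data)
strict_result = match.group(1) if match else None

print(repr(lazy_result))
print(strict_result)
```

The stricter pattern skips the outer Address element because its content starts with a space and another tag, neither of which fits the digits-and-dots character class. Note that it does not check that the captured text is a valid IP address, only that it consists of digits and dots.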