I tried to extract every 'Startpoint' match from my text file using a Python regular expression,
but it isn't giving me the desired result. I'm a beginner and want to extract all matches of the string 'Startpoint' from my text file. Can someone please help?
import re
with open('report.txt') as file:
    f = file.readlines()
# pattern=(r'(Startpoint\:*\s[a-zA-Z]\w*)')
pattern=(r'([^S..t$]\:*\s[A-Z]\W*)')
print(pattern)
mat=re.match(r'([^S..t$]\:*\s[A-Z]\W*)',f,re.M|re.I)
print(mat)
I have also tried the following, but since I used re.sub it removes the 'S' from my string:
with open('report.txt') as file:
    text=file.read()
    f=file.readlines()
pat=re.sub(r'([^S..t$]:*\s[A-Z]\W*)'," ",text)#for replacing
print(pat)
Here is some of the content of my text file:
exintf_max.txt:3870: Startpoint: EXINTF_DATA0
exintf_max.txt:3920: Startpoint: EXINTF_DATA9
exintf_max.txt:3972: Startpoint: EXINTF_DATA11
exintf_max.txt:4022: Startpoint: top/soc/xintf_cntrl/lv_temp_xintf_rg_chip_1_select_enable_reg
exintf_max.txt:4054: Startpoint: EXINTF_DATA15
exintf_max.txt:4105: Startpoint: top/soc/xintf_cntrl/lv_temp_xintf_rg_chip_0_select_enable_reg
exintf_max.txt:4137: Startpoint: EXINTF_DATA1
exintf_max.txt:4189: Startpoint: EXINTF_DATA6
exintf_max.txt:4241: Startpoint: EXINTF_DATA10
exintf_max.txt:4290: Startpoint: EXINTF_DATA10
exintf_max.txt:4341: Startpoint: top/soc/xintf_cntrl/lv_temp_xintf_rg_XWR_reg
I got the output above with the grep command; I want to get the same result with a Python script.
Please help, I'm a beginner.
CodePudding user response:
This checks a single line of that text; you can then use it inside a loop.
import re
# txt = "exintf_max.txt:4341: Startpoint: top/soc/xintf_cntrl/lv_temp_xintf_rg_XWR_reg"
txt = "exintf_max.txt:4290: Startpoint: EXINTF_DATA10"
m = re.match(r"[\w_]+.\w+:(\d+):\s?(Startpoint):\s?(.*)", txt)
if m:
    print(m.groups())
    print(m.group(3))
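The "use it inside a loop" step could look roughly like the sketch below. The helper name extract_startpoints and the in-memory sample list are hypothetical (they stand in for the lines of report.txt), and the pattern assumes the "+" quantifiers shown in the snippet above.

```python
import re

# Same pattern as in the snippet above, compiled once for reuse.
LINE_RE = re.compile(r"[\w_]+.\w+:(\d+):\s?(Startpoint):\s?(.*)")

def extract_startpoints(lines):
    """Return (line_number, value) pairs for every line that matches."""
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            results.append((int(m.group(1)), m.group(3)))
    return results

# Hypothetical sample; in practice this would be the lines read from the file.
sample = [
    "exintf_max.txt:3870: Startpoint: EXINTF_DATA0",
    "some unrelated line",
    "exintf_max.txt:4022: Startpoint: top/soc/xintf_cntrl/lv_temp_xintf_rg_chip_1_select_enable_reg",
]
print(extract_startpoints(sample))
```

Lines that do not match are simply skipped, so the same function can be fed the result of file.readlines().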
CodePudding user response:
To get the values in Python for the string Startpoint, you could use:
^[^:\n]*:\d+:\s*Startpoint:\s*(.+)
Explanation
^ Start of string (of each line when re.M is used)
[^:\n]* Optionally match any character except : or a newline
:\d+: Match 1+ digits between colons
\s* Match optional whitespace chars
Startpoint: Match literally
\s*(.+) Match optional whitespace chars, then capture 1+ characters (the rest of the line) in group 1
If you want to match spaces without newlines, you can change \s to [^\S\n]
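To see why swapping \s for [^\S\n] can matter, here is a small sketch on a made-up three-line text (the a.txt lines are hypothetical, not taken from the question). Because \s also matches a newline, an empty Startpoint: line lets the capture group spill onto the following line.

```python
import re

# Hypothetical input: the first Startpoint line has no value after the colon.
text = (
    "a.txt:1: Startpoint:\n"
    "a.txt:2: Endpoint: FOO\n"
    "a.txt:3: Startpoint: BAR\n"
)

# \s* may cross the line break; [^\S\n]* only matches whitespace other than newline.
loose = r"^[^:\n]*:\d+:\s*Startpoint:\s*(.+)"
strict = r"^[^:\n]*:\d+:[^\S\n]*Startpoint:[^\S\n]*(.+)"

loose_hits = re.findall(loose, text, re.M)
strict_hits = re.findall(strict, text, re.M)

print(loose_hits)   # includes the content of line 2
print(strict_hits)  # only the value that is on a Startpoint line
```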
Example code
import re
with open('report.txt') as file:
    f = file.read()
pattern = r"^[^:\n]*:\d+:\s*Startpoint:\s*(.+)"
print(re.findall(pattern, f, re.M))
Output
[
'EXINTF_DATA0',
'EXINTF_DATA9',
'EXINTF_DATA11',
'top/soc/xintf_cntrl/lv_temp_xintf_rg_chip_1_select_enable_reg',
'EXINTF_DATA15',
'top/soc/xintf_cntrl/lv_temp_xintf_rg_chip_0_select_enable_reg',
'EXINTF_DATA1',
'EXINTF_DATA6',
'EXINTF_DATA10',
'EXINTF_DATA10',
'top/soc/xintf_cntrl/lv_temp_xintf_rg_XWR_reg'
]
CodePudding user response:
If you want to grab what comes after Startpoint: you can just do:
import re
from pathlib import Path
path = Path("report.txt")
pat = re.compile(r'.*Startpoint:\s*(\S*)')
print(pat.findall(path.read_text()))
This skips everything on the line before Startpoint:, then captures whatever follows it up to the next whitespace.
This outputs:
['EXINTF_DATA0', 'EXINTF_DATA9', 'EXINTF_DATA11',
'top/soc/xintf_cntrl/lv_temp_xintf_rg_chip_1_select_enable_reg', 'EXINTF_DATA15',
'top/soc/xintf_cntrl/lv_temp_xintf_rg_chip_0_select_enable_reg',
'EXINTF_DATA1', 'EXINTF_DATA6', 'EXINTF_DATA10', 'EXINTF_DATA10',
'top/soc/xintf_cntrl/lv_temp_xintf_rg_XWR_reg']
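As a quick way to try this pattern without report.txt, the sketch below runs it on an in-memory string. The trailing "(rising edge)" text is a made-up addition, included only to show that \S* stops at the first whitespace after the value.

```python
import re

pat = re.compile(r'.*Startpoint:\s*(\S*)')

# Hypothetical sample; the "(rising edge)" suffix is not from the original file.
sample = (
    "exintf_max.txt:4290: Startpoint: EXINTF_DATA10\n"
    "exintf_max.txt:4341: Startpoint: top/soc/xintf_cntrl/lv_temp_xintf_rg_XWR_reg (rising edge)\n"
)

values = pat.findall(sample)
print(values)
```

If the text after the value should be kept as well, a capture such as (.+) would be the alternative, at the cost of also keeping any trailing spaces.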