I have a regex pattern return a list of all the start and stop indices of an occurring string and I want to be able to highlight each occurrence, it's extremely slow with my current setup — using a 133,000 line file it takes about 8 minutes to highlight all occurrences.
Here's my current solution:
if IPv == 4:
v4FoundUnique = v4FoundUnique 1
# highlight all regions found
for j in range(qty):
v4Found = v4Found 1
# don't highlight if they set the checkbox not to
if highlightText:
# get row.column coordinates of start and end of match
# very slow
startIndex = textField.index('1.0 {} chars'.format(starts[j]))
# compute end based on start, using assumption that IP addresses
# won't span lines drastically faster than computing from raw index
endIndex = "{}.{}".format(startIndex.split(".")[0],
int(startIndex.split(".")[1]) stops[j]-starts[j])
# apply tag
textField.tag_add("{}v4".format("public" if isPublic else "private"),
startIndex, endIndex)
CodePudding user response:
So, TKinter has a pretty bad implementation of changing "absolute location" to its row.column format:
startIndex = textField.index('1.0 {} chars'.format(starts[j]))
it's actually faster to do it like this:
for address in v4check.finditer(filetxt):
# address.group() returns matching text
# address.span() returns the indices (start,stop)
start,stop = address.span()
ip = address.group()
srow = filetxt.count("\n",0,start) 1
scol = start-filetxt.rfind("\n",0,start)-1
start = "{}.{}".format(srow,scol)
stop = "{}.{}".format(srow,scol len(ip))
which takes the regex results and the input file to get the data we need (row.colum)
There could be a faster way of doing this but this is the solution I found that works!