We have two kinds of coordinates in our project. The normal one has x, y and z.
Example:
101, 520, 62
960.93 764.22 59.20
And the extended version has 6 numbers (xyz twice, for position and rotation). Example:
101 520 62 3 0 0
960.93 764.22 59.20 -0.34 0.00 -89.81
They can be negative, and they can be floats or whole numbers. They can be separated by commas or only by spaces.
Using Python, I am trying to find any coordinates in a string.
Example:
textbefore 101, 520, 62
GOTO 960.93 796.22 59.20 -0.34 0.00 -89.81
$5GOTO 1960.93 1796.22 159.20 -0.34 0.00 -89.81
501, 513, 162
1040, 1040, 520 text after
error
222, 222
1500, 1500, 60 (1)
1337 1337 65
124.5, 133.6, 35.4
15:13:26 Condition: index_ != StringList::npos [line 178](125, 157, 215)
Allocating shadow map cache 6324 x 6324: 76.28 MB
In a perfect world the output should be:
101 520 62
960.93 796.22 59.20 -0.34 0.00 -89.81
1960.93 1796.22 159.20 -0.34 0.00 -89.81
501 513 162
1040 1040 520
1500 1500 60
1337 1337 65
124.5 133.6 35.4
125 157 215
The last line ("Allocating shadow map cache") is a bit tricky; if it fails and gets listed as a coordinate, that's fine.
I used the code below, which extracts the numbers very well, and then checked for 6 or 3 numbers, but I have problems with lines that contain more numbers. So I need some logic that checks whether the numbers are "close" to each other or separated by words.
re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", line)
If possible the code should work on Python 2.7 (Sadly we are far behind).
Thanks
CodePudding user response:
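You can use the regex below for this:
(?:(?:[+-]?\d+\.?\d*[ ,]+){5}[+-]?\d+\.?\d*)|(?:(?:[+-]?\d+\.?\d*[ ,]+){2}[+-]?\d+\.?\d*)
This searches for 2 or 5 consecutive numbers, each followed by a comma or space delimiter, and then a 3rd or 6th number. The 6-number alternative comes first, so a full position-plus-rotation line is not cut down to its first 3 numbers.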
Here is a demo.
Output
['101, 520, 62',
'960.93 796.22 59.20 -0.34 0.00 -89.81',
'1960.93 1796.22 159.20 -0.34 0.00 -89.81',
'501, 513, 162',
'1040, 1040, 520',
'1500, 1500, 60',
'1337 1337 65',
'124.5, 133.6, 35.4',
'125, 157, 215']
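As a rough sketch of how this pattern could be applied: the helper name find_coordinates is made up for illustration, and the + quantifiers are assumed to be what the pattern intends. Each match is normalized to the space-separated form asked for in the question. Only the re module is used, so it should also run on Python 2.7.

```python
import re

# One signed integer or decimal number, e.g. 62, -0.34, 960.93
NUMBER = r"[+-]?\d+\.?\d*"
# Six numbers are tried before three so a position+rotation match wins.
COORD_RE = re.compile(
    r"(?:(?:{n}[ ,]+){{5}}{n})|(?:(?:{n}[ ,]+){{2}}{n})".format(n=NUMBER)
)


def find_coordinates(text):
    """Return each coordinate found in text as a space-separated string."""
    results = []
    for match in COORD_RE.findall(text):
        # Split on the comma/space delimiters and re-join with single spaces.
        results.append(" ".join(re.split(r"[ ,]+", match)))
    return results


print(find_coordinates("textbefore 101, 520, 62"))
print(find_coordinates("GOTO 960.93 796.22 59.20 -0.34 0.00 -89.81"))
```

Because the delimiter class does not include newlines, the function can be called on a whole multi-line log as well as on single lines. A line with 4 or 5 numbers would still yield its first 3, so an extra length check may be needed if such lines occur.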
CodePudding user response:
Before getting too far into the code, you need to figure out an algorithm or method in pseudocode that does what you ask. In this example you need Python code to identify a number:
def is_number(token):
    # Tokens produced by split() are strings, so a type check would always fail.
    # Try the conversion instead.
    try:
        float(token)
        return True
    except ValueError:
        return False
Then I would split on spaces or commas and go through the resulting list looking for 3 or 6 Trues in a row.
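The split-and-scan idea might look roughly like the sketch below. The name find_runs is hypothetical, and treating "3 or 6 in a row" as exactly 3 or 6 consecutive numeric tokens is an assumption about what is wanted.

```python
import re


def is_number(token):
    # Tokens are strings, so test whether float() accepts them.
    try:
        float(token)
        return True
    except ValueError:
        return False


def find_runs(line):
    """Return runs of exactly 3 or 6 consecutive numeric tokens in line."""
    tokens = [t for t in re.split(r"[ ,]+", line.strip()) if t]
    runs, current = [], []
    for token in tokens + [""]:  # the empty sentinel flushes the final run
        if is_number(token):
            current.append(token)
        else:
            if len(current) in (3, 6):
                runs.append(" ".join(current))
            current = []
    return runs


print(find_runs("1040, 1040, 520 text after"))
print(find_runs("222, 222"))
```

One trade-off of this approach: tokens glued to punctuation, such as "(125," or "215)", are not numbers to float(), so the bracketed coordinate in the "Condition:" log line would be missed unless punctuation is stripped first. float() also accepts words like "nan" and "inf", which could matter for some logs.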
CodePudding user response:
s = '''textbefore 101, 520, 62
GOTO 960.93 796.22 59.20 -0.34 0.00 -89.81
$5GOTO 1960.93 1796.22 159.20 -0.34 0.00 -89.81
501, 513, 162
1040, 1040, 520 text after
error
222, 222
1500, 1500, 60 (1)
1337 1337 65
124.5, 133.6, 35.4
15:13:26 Condition: index_ != StringList::npos [line 178](125, 157, 215)
Allocating shadow map cache 6324 x 6324: 76.28 MB'''
s = s.split('\n')
s_row = []
for i in range(len(s)):
    s_row.append(s[i].replace(',', '').split(' '))
coord = []
for i in range(len(s_row)):
    coord_row = []
    for j in range(len(s_row[i])):
        try:
            s_row[i][j] = float(s_row[i][j])
            coord_row.append(s_row[i][j])
        except ValueError:
            pass
    if coord_row != []:
        coord.append(coord_row)
This will give you the following output:
[[101.0, 520.0, 62.0]
[960.93, 796.22, 59.2, -0.34, 0.0, -89.81]
[1960.93, 1796.22, 159.2, -0.34, 0.0, -89.81]
[501.0, 513.0, 162.0]
[1040.0, 1040.0, 520.0]
[222.0, 222.0]
[1500.0, 1500.0, 60.0]
[1337.0, 1337.0, 65.0]
[124.5, 133.6, 35.4]
[157.0]
[6324.0, 76.28]]
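The output above still contains rows such as [222.0, 222.0] and [157.0] that are not coordinates. One possible follow-up step, sketched here on a few rows copied from that output (the name valid is made up for illustration), is to keep only rows with 3 or 6 numbers:

```python
# A few rows in the shape produced above (values copied from the output).
coord = [
    [101.0, 520.0, 62.0],
    [960.93, 796.22, 59.2, -0.34, 0.0, -89.81],
    [222.0, 222.0],
    [157.0],
    [6324.0, 76.28],
]

# Keep only rows that look like a position (3) or position plus rotation (6).
valid = [row for row in coord if len(row) in (3, 6)]
print(valid)
```

This filter does not recover (125, 157, 215), since the parentheses already prevented two of its numbers from being parsed; stripping such punctuation before calling float() would be needed for that line.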
CodePudding user response:
You can do this with a regular expression. Since the re module that ships with Python doesn't handle repeating or nested capture groups well, you are best off assembling a larger regex from parts.
r"([-+]?\d+[\.\d]*)"
is a regular expression that matches an integer or float with an optional sign. [-+]? matches the sign, \d+ matches at least one digit before a dot, [\.\d]* matches the optional fractional part of the float, and the outer () tells the regex to capture the matched string.
r"[ ,]+"
is the separator between the numbers. It has to match at least one space or comma; otherwise a single number like 222 could be split into three. Now you could just write out 3 and 6 of these things together, but the code below does that with a bit of Python.
import re
import io
# The u prefix keeps io.StringIO working on Python 2.7, which requires unicode input.
test = io.StringIO(u"""textbefore 101, 520, 62
GOTO 960.93 796.22 59.20 -0.34 0.00 -89.81
$5GOTO 1960.93 1796.22 159.20 -0.34 0.00 -89.81
501, 513, 162
1040, 1040, 520 text after
error
222, 222
1500, 1500, 60 (1)
1337 1337 65
124.5, 133.6, 35.4
15:13:26 Condition: index_ != StringList::npos [line 178](125, 157, 215)
Allocating shadow map cache 6324 x 6324: 76.28 MB""")
# assemble the 3- and 6-number regexes from the single-number pattern
_one = r"([-+]?\d+[\.\d]*)"
_sep = r"[ ,]+"
coord_3 = re.compile(_sep.join([_one]*3))
coord_6 = re.compile(_sep.join([_one]*6))
coords = []
for line in test:
    # try 6 numbers first so a position+rotation line is not cut down to 3
    match = coord_6.search(line) or coord_3.search(line)
    if match is not None:
        print(match.groups())
        coords.append(" ".join(match.groups()))
for c in coords:
    print(c)