I have a file (.VAR) that gives me a position and a length within a string on each row; see the example below.
*STRING1 1L8:StringONE
*STRINGWITHVARIABLELENGTH2 *ABC 29L4:StringTWO
*STRINGWITHLENGTH3 *ABC 33L2:StringTHREE
How do I retrieve the " xxLxxx:" value, which is always preceded by a space and always ends with a colon, but is never at the same location within the string?
Ideally I would like to get the number before the L as the position and the number after the L as the length, but searching only for "L" would also match other parts of the string. So I think I have to match the whole space_number_L_number_colon pattern to recognize this part, but I don't know how.
Any thoughts? TIA
CodePudding user response:
You can use a regex here.
Example:
s='''*STRING1 1L8:StringONE
*STRINGWITHVARIABLELENGTH2 *ABC 29L4:StringTWO
*STRINGWITHLENGTH3 *ABC 33L2:StringTHREE'''
import re
out = re.findall(r'\s(\d+)L(\d+):', s)
output: [('1', '8'), ('29', '4'), ('33', '2')]
As integers:
out = [tuple(map(int, x)) for x in re.findall(r'\s(\d+)L(\d+):', s)]
output: [(1, 8), (29, 4), (33, 2)]
regex:
\s # whitespace character (here, the space)
(\d+) # capture one or more digits
L # literal L
(\d+) # capture one or more digits
: # literal :
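If the file is processed one row at a time, the same pattern can be applied per line. The following is only a sketch: the `parse_line` helper, the `FIELD` name, and the named groups are illustrative choices that are not part of the answer above, and it assumes each row contains at most one such field. It uses a literal space instead of `\s`, since `\s` would also match tabs and newlines.

```python
import re

# Literal space, digits (position), "L", digits (length), colon.
# FIELD and the group names are hypothetical, chosen for readability.
FIELD = re.compile(r" (?P<pos>\d+)L(?P<length>\d+):")

def parse_line(line):
    """Return (position, length, text after the colon), or None if no match."""
    m = FIELD.search(line)
    if m is None:
        return None
    rest = line[m.end():].rstrip("\n")  # whatever follows the colon
    return int(m.group("pos")), int(m.group("length")), rest

lines = [
    "*STRING1 1L8:StringONE",
    "*STRINGWITHVARIABLELENGTH2 *ABC 29L4:StringTWO",
    "*STRINGWITHLENGTH3 *ABC 33L2:StringTHREE",
]
parsed = [parse_line(line) for line in lines]
for row in parsed:
    print(row)
```

Rows without a matching field come back as `None`, so they can be filtered out or reported instead of raising an error.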