I am trying to convert the string '4-6,10-12,16' into a list that looks like this: [4,"-",6,10,"-",12,16]. The list should contain a mix of integers and the special character "-".
I tried a regex in Python, but I could only get it to extract the numbers, and I need the dashes in the list as well. How can I include the dashes along with the numbers?
Here is my code:
interval='4-6,10-12,16'
import re
l=[int(s) for s in re.findall(r'\b\d+\b', interval)]
CodePudding user response:
Try this:
interval='4-6,10-12,16'
import re
l=[int(s) if s.isnumeric() else s for s in re.findall(r'\d+|-', interval)]
l
Output:
[4, '-', 6, 10, '-', 12, 16]
CodePudding user response:
You can use
import re
interval='4-6,10-12,16'
l=[int(s) if all(c.isdigit() for c in s) else '-' for s in re.findall(r'\d+|-', interval)]
print(l) # => [4, '-', 6, 10, '-', 12, 16]
See the Python demo.
Details:
re.findall(r'\d+|-', interval) extracts digit sequences or - chars.
int(s) if all(c.isdigit() for c in s) else '-' either casts a digit sequence to an int if the whole match consists of digits, or just returns - as a string.
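To make the two steps visible, here is a minimal sketch that keeps the raw matches and the converted list side by side. The helper name `tokenize` is made up for this illustration and is not part of the answer above.

```python
import re

def tokenize(interval):
    # Step 1: raw string tokens. The pattern matches runs of digits or a
    # single dash; commas match neither alternative, so they are skipped.
    raw = re.findall(r'\d+|-', interval)
    # Step 2: digit runs become ints, dashes stay as strings.
    converted = [int(s) if s.isdigit() else s for s in raw]
    return raw, converted

raw, converted = tokenize('4-6,10-12,16')
print(raw)        # ['4', '-', '6', '10', '-', '12', '16']
print(converted)  # [4, '-', 6, 10, '-', 12, 16]
```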
CodePudding user response:
Useful functions:
str.isdigit (or str.isnumeric or str.isdecimal);
itertools.groupby to group adjacent characters that share a characteristic.
from itertools import groupby
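def tokenize_digits_and_dashes(s):
    # Group adjacent characters by the pair (is a digit, is a dash).
    for k, g in groupby(s, key=lambda c: (c.isdigit(), c == '-')):
        if k == (True, False):
            # A run of digits: join it and convert to int.
            yield int(''.join(g))
        elif k == (False, True):
            yield '-'
        # Anything else (here: commas) is dropped.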
print(list(tokenize_digits_and_dashes('4-6,10-12,16')))
# [4, '-', 6, 10, '-', 12, 16]
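To see what groupby actually produces for this key function, the following sketch lists each key with the characters it grouped. It is only an inspection aid; the variable name `groups` is an assumption of this example.

```python
from itertools import groupby

s = '4-6,10-12,16'

# The key is a pair: (is the character a digit?, is it a dash?).
# Adjacent characters with the same key end up in the same group.
groups = [(k, ''.join(g))
          for k, g in groupby(s, key=lambda c: (c.isdigit(), c == '-'))]

for k, text in groups:
    print(k, repr(text))
```

Commas get the key (False, False), which is why the generator above skips them. Note also that a run of adjacent dashes such as '--' would form a single group, so that generator would yield only one '-' for it.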
Alternative approach
Your string already contains separators in the form of commas (,). These are useful! Don't ignore them. You can split the string on the separators using str.split.
def tokenize_intervals(s):
    for interval in s.split(','):
        i = interval.split('-')
        if len(i) == 2:
            # 'a-b' becomes the pair (a, b).
            yield tuple(int(w) for w in i)
        elif len(i) == 1:
            # A lone number 'x' becomes the degenerate interval (x, x).
            x = int(i[0])
            yield (x, x)
print(list(tokenize_intervals('4-6,10-12,16')))
# [(4, 6), (10, 12), (16, 16)]
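One reason the tuple form can be convenient: if the goal is eventually to list every number covered by the intervals (an assumption, since the question does not say so), the pairs feed directly into range. The sketch below repeats a compact version of the tokenizer so it runs on its own. The helper `expand` is hypothetical, and it treats both bounds as inclusive.

```python
def tokenize_intervals(s):
    # Compact restatement of the generator above: 'a-b' -> (a, b), 'x' -> (x, x).
    for interval in s.split(','):
        parts = [int(w) for w in interval.split('-')]
        if len(parts) == 2:
            yield (parts[0], parts[1])
        elif len(parts) == 1:
            yield (parts[0], parts[0])

def expand(intervals):
    # Assumption: both ends of each interval are inclusive.
    for lo, hi in intervals:
        yield from range(lo, hi + 1)

print(list(expand(tokenize_intervals('4-6,10-12,16'))))
# [4, 5, 6, 10, 11, 12, 16]
```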
CodePudding user response:
# By Using Regex #
# -------------- #
import re
interval = '4-6,10-12,16'
s_list = re.findall(r'\d+|-', interval)
x = [int(_) if _.isnumeric() else _ for _ in s_list]
print(x)
# By Using the split method #
# ------------------------- #
final_list = []
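for part in interval.split(','):
    sub_list = part.split('-')
    for pos, i in enumerate(sub_list):
        if i.isnumeric():
            final_list.append(int(i))
        # Add a dash after every item except the last one in this part.
        # Comparing positions rather than values keeps ranges like '5-5' intact.
        if pos != len(sub_list) - 1:
            final_list.append('-')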
print(final_list)
# By Checking Character By Character #
# ---------------------------------- #
z = ""
s = []
count = 0
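for ch in interval:
    count += 1
    if ch.isnumeric():
        # Accumulate the digits of the current number.
        z += ch
        # The last number has no separator after it, so flush it here.
        if count == len(interval):
            s.append(int(z))
    elif ch == '-':
        # A dash ends the current number and is kept as a token.
        s.append(int(z))
        z = ""
        s.append('-')
    else:
        # A comma ends the current number and is dropped.
        s.append(int(z))
        z = ""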
print(s)