I have a string s that looks like the following:
s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"
How can I obtain a list l that extracts the numbers and \\\\N, so that the list looks like the following:
l = [9, -9, '\\\\N', 28, -2, 0.000, '\\\\N', 1.0000]
I tried re.findall('[-]?\d+[.]?[\d]*', s), but it only extracts the numbers. How should I modify my regular expression to include \\\\N?
CodePudding user response:
You are almost there. Make the decimal part an optional group and add \\\\N as a second alternative: r"-?\d+(?:\.\d+)?|\\\\N"
Test regex here: https://regex101.com/r/U3uEyQ/1
Python code:
import re
s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"
re.findall(r"-?\d+(?:\.\d+)?|\\\\N", s)
# ['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
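The pattern above can be easier to follow when written out piece by piece. The following is only a sketch of the same regex using re.VERBOSE, which ignores whitespace in the pattern and allows comments; the name PATTERN is an assumption for illustration and is not part of the answer.

```python
import re

s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"

# Same pattern as in the answer, split up with re.VERBOSE so each part
# can carry a comment. PATTERN is a hypothetical name.
PATTERN = re.compile(r"""
    -?            # optional minus sign
    \d+           # one or more digits (integer part)
    (?:\.\d+)?    # optional fractional part: a dot followed by digits
    |             # OR
    \\\\N         # two literal backslashes followed by N
""", re.VERBOSE)

matches = PATTERN.findall(s)
print(matches)
# ['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
```

Note that findall still returns every match as a string, so the numbers need a separate conversion step if int and float values are wanted.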
CodePudding user response:
Alternative solution without regex:
s = """{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"""
def remove_chars(text):
    for ch in ['{', '}', '[', ']', '\n', '\'']:
        text = text.replace(ch, "")
    return text
result = remove_chars(s).split()
print(result)
Prints
['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
If you would like to convert the numeric strings to int and float, extend the code with the following:
# convert str to int and float if possible
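def is_digit(item):
    # str.isdigit() is False for strings with a sign or a decimal point,
    # so '-9' and '0.000' fall through to the float conversion below
    if item.isdigit():
        return int(item)
    else:
        try:
            return float(item)
        except ValueError:
            return item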
result_types = [is_digit(x) for x in result]
print(result_types)
Prints
[9, -9.0, '\\\\N', 28, -2.0, 0.0, '\\\\N', 1.0]
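The output above contains -9.0 and -2.0 because str.isdigit() returns False for strings with a leading minus sign, while the list asked for in the question keeps them as integers. One possible variant, sketched here with the hypothetical helper name to_number, tries int first and falls back to float. It also uses str.translate to strip the brackets and quotes in one pass, which is an alternative to the loop above rather than what the answer does.

```python
s = """{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"""

def to_number(item):
    """Hypothetical helper: return int if possible, else float, else the string."""
    for cast in (int, float):
        try:
            return cast(item)
        except ValueError:
            continue
    return item

# Delete the braces, brackets and single quotes, then split on whitespace
# (split() with no arguments also swallows the newline).
tokens = s.translate(str.maketrans("", "", "{}[]'")).split()
result_types = [to_number(t) for t in tokens]
print(result_types)
# [9, -9, '\\\\N', 28, -2, 0.0, '\\\\N', 1.0]
```

Python prints 0.000 and 1.0000 as 0.0 and 1.0; the values are the same floats, only the trailing zeros are not preserved.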