Extract numbers and/or strings from a Python string using a regular expression

I have a string s that looks like the following:

s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"

How can I obtain a list l that extracts the numbers and \\\\N, so that the list looks like the following:

l = [9, -9, '\\\\N', 28, -2, 0.000, '\\\\N', 1.0000]

I tried re.findall('[-]?\d+[.]?[\d]*', s), but it only extracts the numbers. How should I modify my regular expression so that it also matches \\\\N?

CodePudding user response：

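You are almost there. You can change your pattern to: r"-?\d+(?:\.\d+)?|\\\\N"

The first alternative matches an optionally signed integer or decimal number; the second alternative matches the literal \\\\N.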

Test regex here: https://regex101.com/r/U3uEyQ/1

Python code:

import re

s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"
re.findall(r"-?\d+(?:\.\d+)?|\\\\N", s)
# ['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
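
re.findall returns every match as a string. To get the mixed-type list from the question, the matches can be converted afterwards. The following is a minimal sketch, assuming a hypothetical helper to_number (not part of the answer above) that treats a match containing a decimal point as a float, any other numeric match as an int, and leaves the backslash marker as a string:

```python
import re

s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"
pattern = r"-?\d+(?:\.\d+)?|\\\\N"

def to_number(token):
    # Hypothetical helper: the marker starts with a backslash, keep it as a string.
    if token[0] == "\\":
        return token
    # Otherwise choose int or float by the presence of a decimal point.
    return float(token) if "." in token else int(token)

l = [to_number(m) for m in re.findall(pattern, s)]
print(l)
# [9, -9, '\\\\N', 28, -2, 0.0, '\\\\N', 1.0]
```

Note that Python prints 0.000 and 1.0000 as 0.0 and 1.0; the trailing zeros are not preserved once the strings become floats.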


CodePudding user response：

Alternative solution without regex:

s = """{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"""

def remove_chars(text):
    for ch in ['{', '}', '[', ']', '\n', '\'']:
        text = text.replace(ch, "")
    return text

result = remove_chars(s).split()

print(result)

Prints

['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']

If you would like to convert the numeric strings to int and float, extend the code with the following:

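# Convert a string to int or float if possible; otherwise return it unchanged.
def is_digit(item):
    # str.isdigit() is True only when every character is a digit.
    if item.isdigit():
        return int(item)
    else:
        try:
            return float(item)
        except ValueError:
            # Not a number (for example the backslash marker): keep the string.
            return item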

result_types = [is_digit(x) for x in result]

print(result_types)

Prints

[9, -9.0, '\\\\N', 28, -2.0, 0.0, '\\\\N', 1.0]
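
The -9.0 and -2.0 in that output appear because str.isdigit() returns False for a string with a leading minus sign, so those items fall through to float(). If they should stay integers, as in the list the question asks for, one option is to try int() first and fall back to float(). A sketch, with to_int_or_float as a hypothetical name:

```python
def to_int_or_float(item):
    # Try the stricter conversion first: int() accepts '-9' but rejects '0.000'.
    try:
        return int(item)
    except ValueError:
        pass
    try:
        return float(item)
    except ValueError:
        # Not numeric at all (for example the backslash marker): keep the string.
        return item

result = ['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
result_types = [to_int_or_float(x) for x in result]
print(result_types)
# [9, -9, '\\\\N', 28, -2, 0.0, '\\\\N', 1.0]
```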