Python : Split a string into fragments based on a list of separators

Time:11-27

I'm kinda struggling to fix the following case...

Imagine this string :

str = "three hundred   four - fifty six * eight"

Is there a way to get the following array :

array = ["three hundred", " ", "four", "-", "fifty six", "*", "eight"]

knowing that I have a list of multiple operators (used as delimiters in the string I guess) ?

Splitting the string on the space delimiter is easy, but I would like to keep every delimited part as one item of my list!

Also, is this possible without using any import like re for example?

Thanks in advance !

CodePudding user response:

You could get this done with a simple regular expression, assuming you only need words

import re
s = "three hundred   four - fifty six * eight"
print(re.findall(r"\w+", s))

result: ['three', 'hundred', 'four', 'fifty', 'six', 'eight']

CodePudding user response:

In a more algorithmic way:

def split(string_sep, separators):
    res = []
    last = 0  # position just after the last separator found
    index = 0
    for index, char in enumerate(string_sep):
        if char in separators:
            res.append(string_sep[last:index].strip())  # strip if you don't want spaces between separators and words
            res.append(char)
            last = index + 1  # +1 to skip the separator itself

    # add whatever follows the last separator
    if last <= index:
        res.append(string_sep[last:].strip())
    return res

CodePudding user response:

Here's a function that walks the string one character at a time and keeps each delimiter it finds. Because it tests single characters, putting ' ' in the delimiters would split at every space, so the space is left out below and the run of three spaces stays inside the first fragment. The bigger problem is how to deal with improperly formatted strings. The comment by Chris in the answer points you to a question about tokenizing with an abstract syntax tree, which is what you'd really need. Essentially that's a bit like writing the re module from scratch. Anyway:

def deconstructor(sample, delims):
    result = []
    loader = []  # characters of the fragment currently being built
    for item in sample:
        if item not in delims:
            loader.append(item)
        else:
            result.append(''.join(loader).strip())
            loader.clear()
            result.append(item)  # add that delimiter to the list
    if loader:  # the check is not required for a properly formatted string
        result.append(''.join(loader).strip())

    return result
>>> deconstructor("three hundred   four - fifty six * eight", ('-', '*', '/'))
['three hundred   four', '-', 'fifty six', '*', 'eight']
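
For what it's worth, the exact list from the question seems reachable without imports if an operator only counts as a delimiter when it has a single space on each side. That is an assumption about the input format, not something the question states, and split_on_operators is a made-up name for this sketch:

```python
def split_on_operators(text, operators):
    """Split text at operators that sit between two single spaces, keeping the operators.

    Assumes every operator is one character long and is written as " <op> ".
    """
    parts = []
    start = 0  # index where the current fragment begins
    i = 1
    while i < len(text) - 1:
        if text[i] in operators and text[i - 1] == " " and text[i + 1] == " ":
            parts.append(text[start:i - 1])  # fragment, without the space before the operator
            parts.append(text[i])            # the operator itself
            start = i + 2                    # skip the space after the operator
            i = start + 1
        else:
            i += 1
    parts.append(text[start:])  # whatever follows the last operator
    return parts


tokens = split_on_operators("three hundred   four - fifty six * eight", [" ", "-", "*", "/"])
print(tokens)
# ['three hundred', ' ', 'four', '-', 'fifty six', '*', 'eight']
```

A lone space between two words is not surrounded by spaces, so "three hundred" stays in one piece, while the middle space of the three-space run is picked up as an operator.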

CodePudding user response:

Uses no imports:

mstr = "three foo fifty bar   four foo - fifty six * eight"
# ' ' is left out: as a single-character delimiter it would split at every space
dels = ['-', '*', '/']

# find delimiters
split_at = [0]

for item in dels:
    indices = [i for i, x in enumerate(mstr) if x == item]

    for index in indices:
        split_at.append(index)

split_at = sorted(split_at)

# split at delimiters
split_str = []
split_str.append(mstr[:split_at[1]])

for split_id in range(2, len(split_at)):
    split_str.append(mstr[split_at[split_id-1]])
    split_str.append(mstr[split_at[split_id-1]+1:split_at[split_id]])

split_str.append(mstr[split_at[-1]])
split_str.append(mstr[split_at[-1]+1:])

Result:

['three foo fifty bar   four foo ', '-', ' fifty six ', '*', ' eight']

CodePudding user response:

We can split a string with the help of a regular expression. You only need to write one pattern, which cuts down the number of lines of code.

# Python3 code to demonstrate working of
# Splitting operators in String
# Using re.split()
import re

# initializing string
test_str = "three hundred   four - fifty six * eight"

# printing original string
print("The original string is : " + str(test_str))

# Using re.split()
# Splitting operators in String
res = re.split(r'(\ |\-|\*|\/)', test_str)

# printing result
print("The list after performing split functionality : " + str(res))

To learn how to write regular expressions, see https://www.programiz.com/python-programming/regex. I am only posting this answer for people who want to split the string with the re module in Python.
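
If the goal is the exact list from the question, a capturing group with a space on each side of the operator seems to do it. This is only a sketch, assuming operators are always written with one space on each side; the operators list and the pattern name are illustrative, not taken from the question:

```python
import re

operators = ["-", "*", "/", " "]  # assumed operator list
# one space, a captured operator, one space; re.escape keeps '*' and '-' literal
pattern = " (" + "|".join(re.escape(op) for op in operators) + ") "

test_str = "three hundred   four - fifty six * eight"
tokens = re.split(pattern, test_str)
print(tokens)
# ['three hundred', ' ', 'four', '-', 'fifty six', '*', 'eight']
```

Because the group is capturing, re.split keeps the matched operator in the result, while the spaces around it fall outside the group and are dropped.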
