Home > database >  python string split by separator all possible permutations
python string split by separator all possible permutations

Time:10-13

This might be heavily related to similar questions as Python 3.3: Split string and create all combinations , but I can't infer a pythonic solution out of this.

Question is:

Let there be a str such as 'hi|guys|whats|app', and I need all permutations of splitting that str by a separator. Example:

#splitting only once
['hi','guys|whats|app']
['hi|guys','whats|app']
['hi|guys|whats','app']
#splitting only twice
['hi','guys','whats|app']
['hi','guys|whats','app']
#splitting only three times
...
etc

I could write a backtracking algorithm, but does python (itertools, e.g.) offer a library that simplifies this algorithm?

Thanks in advance!!

CodePudding user response:

Here is a recursive function I came up with:

def splitperms(string, i=0):
    if len(string) == i:
        return [[string]]
    elif string[i] == "|":
        return [*[[string[:i]]   split for split in splitperms(string[i   1:])], *splitperms(string, i   1)]
    else:
        return splitperms(string, i   1)

Output:

>>> splitperms('hi|guys|whats|app')
[['hi', 'guys', 'whats', 'app'], ['hi', 'guys', 'whats|app'], ['hi', 'guys|whats', 'app'], ['hi', 'guys|whats|app'], ['hi|guys', 'whats', 'app'], ['hi|guys', 'whats|app'], ['hi|guys|whats', 'app'], ['hi|guys|whats|app']]
>>> 

CodePudding user response:

One approach using combinations and chain

from itertools import combinations, chain


def partition(alist, indices):
    # https://stackoverflow.com/a/1198876/4001592
    pairs = zip(chain([0], indices), chain(indices, [None]))
    return (alist[i:j] for i, j in pairs)


s = 'hi|guys|whats|app'
delimiter_count = s.count("|")
splits = s.split("|")


for i in range(1, delimiter_count   1):
    print("split", i)
    for combination in combinations(range(1, delimiter_count   1), i):
        res = ["|".join(part) for part in partition(splits, combination)]
        print(res)

Output

split 1
['hi', 'guys|whats|app']
['hi|guys', 'whats|app']
['hi|guys|whats', 'app']
split 2
['hi', 'guys', 'whats|app']
['hi', 'guys|whats', 'app']
['hi|guys', 'whats', 'app']
split 3
['hi', 'guys', 'whats', 'app']

The idea is to generate all the ways to pick (or remove) a delimiter 1, 2, 3 times and generate the partitions from there.

CodePudding user response:

You can find index all '|' then in all combination replace '|' with ',' then split base ',' like below:

>>> from itertools import combinations
>>> st = 'hi|guys|whats|app'
>>> idxs_rep = [idx for idx, s in enumerate(st) if s=='|']

>>> def combs(x):
...    return [c for i in range(len(x) 1) for c in combinations(x,i)]

>>> for idxs in combs(idxs_rep):        
...    lst_st = list(st)
...    for idx in idxs:
...        lst_st[idx] = ','
...    st2 = ''.join(lst_st)
...    print(st2.split(','))

['hi|guys|whats|app']
['hi', 'guys|whats|app']
['hi|guys', 'whats|app']
['hi|guys|whats', 'app']
['hi', 'guys', 'whats|app']
['hi', 'guys|whats', 'app']
['hi|guys', 'whats', 'app']
['hi', 'guys', 'whats', 'app']

CodePudding user response:

An approach, once you have split the string is to use itertools.combinations to define the split points in the list, the other positions should be fused again.

def lst_merge(lst, positions, sep='|'):
    '''merges a list on points other than positions'''
    '''A, B, C, D and 0, 1 -> A, B, C|D'''
    a = -1
    out = []
    for b in list(positions) [len(lst)-1]:
        out.append('|'.join(lst[a 1:b 1]))
        a = b
    return out

def split_comb(s, split=1, sep='|'):
    from itertools import combinations
    l = s.split(sep)
    return [lst_merge(l, pos, sep=sep)
            for pos in combinations(range(len(l)-1), split)]

examples

>>> split_comb('hi|guys|whats|app', 0)
[['hi|guys|whats|app']]

>>> split_comb('hi|guys|whats|app', 1)
[['hi', 'guys|whats|app'],
 ['hi|guys', 'whats|app'],
 ['hi|guys|whats', 'app']]

>>> split_comb('hi|guys|whats|app', 2)
[['hi', 'guys', 'whats|app'],
 ['hi', 'guys|whats', 'app'],
 ['hi|guys', 'whats', 'app']]

>>> split_comb('hi|guys|whats|app', 3)
[['hi', 'guys', 'whats', 'app']]

>>> split_comb('hi|guys|whats|app', 4)
[] ## impossible

rationale

ABCD -> A B C D
         0 1 2

combinations of split points: 0/1 or 0/2 or 1/2

0/1 -> merge on 2 -> A B CD
0/2 -> merge on 1 -> A BC D
1/2 -> merge on 0 -> AB C D

generic function

Here is a generic version, working like above but also taking -1 as parameter for split, in which case it will output all combinations

def lst_merge(lst, positions, sep='|'):
    a = -1
    out = []
    for b in list(positions) [len(lst)-1]:
        out.append('|'.join(lst[a 1:b 1]))
        a = b
    return out

def split_comb(s, split=1, sep='|'):
    from itertools import combinations, chain
    
    l = s.split(sep)
    
    if split == -1:
        pos = chain.from_iterable(combinations(range(len(l)-1), r)
                                  for r in range(len(l) 1))
    else:
        pos = combinations(range(len(l)-1), split)
        
    return [lst_merge(l, pos, sep=sep)
            for pos in pos]

example:

>>> split_comb('hi|guys|whats|app', -1)
[['hi|guys|whats|app'],
 ['hi', 'guys|whats|app'],
 ['hi|guys', 'whats|app'],
 ['hi|guys|whats', 'app'],
 ['hi', 'guys', 'whats|app'],
 ['hi', 'guys|whats', 'app'],
 ['hi|guys', 'whats', 'app'],
 ['hi', 'guys', 'whats', 'app']]
  • Related