Unable to understand the looping logic-CodePudding

I am trying to solve a problem using Python. The problem: Given a string like 'a b c d', replace the space with '_' for all combinations. So here, the expected output would be: 'a_b c d', 'a_b_c d', 'a_b_c_d', 'a b_c d', 'a b_c_d', 'a b c_d'

I am using a sliding window approach for this, here's my code:

ip = 'a b c d'
org = ip
res = []

for i in range(len(ip)):
    if ip[i] == ' ': i  = 1
    for j in range(i   1, len(ip)):
        if ip[j] == ' ':
            ip = ip[:j]   '_'   ip[j 1:]
            res.append(ip)
        j  = 1
    i  = 1
    ip = org

The problem is, the 2nd for loop is looping twice and appending duplicate results.

Result: ['a_b c d', 'a_b_c d', 'a_b_c_d', 'a b_c d', 'a b_c_d', 'a b_c d', 'a b_c_d', 'a b c_d', 'a b c_d']

I am not able to figure why this is happening, would appreciate any help.

Thanks!

CodePudding user response：

Beyond itertools.product, there's a neat binary trick you can do here since you're essentially flipping "bits" at given indices in the string:

ip = 'a b c d'

# Figure out where the spaces are.
space_indices = [i for i, x in enumerate(ip) if x == ' ']
# Loop over all numbers from 0 (no bits set) to 2^N-1 (all bits set)
for x in range(1 << len(space_indices)):
    new_ip = list(ip)
    for i, pos in enumerate(space_indices):  # loop over every "bit"
        if x & (1 << i):  # if the i-th bit is set...
            new_ip[pos] = '_'  # ... replace with an underscore
    print("".join(new_ip))

This prints out

a b c d
a_b c d
a b_c d
a_b_c d
a b c_d
a_b c_d
a b_c_d
a_b_c_d

For a reverse order, you can switch to if not.

The loop body could also be

    # Figure out which positions to replace with underscores
    # by looking at which bits are set in `x`.
    to_replace = {pos for i, pos in enumerate(space_indices) if x & (1 << i)}
    # Print out a string with some of the characters replaced.
    print("".join("_" if pos in to_replace else c for pos, c in enumerate(ip)))

if you like a more functional approach.

CodePudding user response：

I would use itertools.product here:

from itertools import product, chain

ip = 'a b c d'
words = ip.split()

res = [''.join(chain(*zip(words, p), words[-1]))
       for p in product([' ', '_'], repeat=len(words)-1)]

Output:

['a b c d',
 'a b c_d',
 'a b_c d',
 'a b_c_d',
 'a_b c d',
 'a_b c_d',
 'a_b_c d',
 'a_b_c_d']

CodePudding user response：

Thanks so much for your help everyone, especially @Axe319. The issue was with the second line in the for loop, here's the fixed solution:

ip = 'a b c d'
org = ip
res = []

for i in range(len(ip)):
    if ip[i] == ' ': continue
    for j in range(i   1, len(ip)):
        if ip[j] == ' ':
            ip = ip[:j]   '_'   ip[j 1:]
            res.append(ip)
        j  = 1
    i  = 1
    ip = org

CodePudding user response：

With str.find and re.sub functions (with specifying the maximum number of pattern occurrences to be replaced):

import re

ip = 'a b c d'
sep_cnt = ip.count(' ')  # count number of spaces
res = []
sep_pos = ip.find(' ')  # get separator position
for i in range(sep_cnt):
    for j in range(sep_cnt - i):  # countdown replacements
        res.append(ip[:sep_pos]   re.sub(r' ', '_', ip[sep_pos:], j   1))
    sep_pos = ip.find(' ', sep_pos   1)

print(res)

['a_b c d', 'a_b_c d', 'a_b_c_d', 'a b_c d', 'a b_c_d', 'a b c_d']