How to replace every third word in a string with the # length equivalent

Input:

string = "My dear adventurer, do you understand the nature of the given discussion?"

Expected output:

string = 'My dear ##########, do you ########## the nature ## the given ##########?'

How can you replace every third word in a string with a run of # characters as long as that word, without counting special characters found in the string such as apostrophes ('), quotation marks ("), full stops (.), commas (,), exclamation marks (!), question marks (?), colons (:) and semicolons (;)?

I took the approach of converting the string to a list of elements, but I am having difficulty filtering out the special characters and replacing the words with the # equivalent. Is there a better way to go about it?

CodePudding user response：

With the help of some regex. The explanation is in the comments.

import re


imp = "My dear adventurer, do you understand the nature of the given discussion?"
every_nth = 3  # in case you want to change this later

out_list = []

# split the input at spaces, enumerate the parts for looping
for idx, word in enumerate(imp.split(' ')):

    # only do the special logic for multiples of n (0-indexed, thus + 1)
    if (idx + 1) % every_nth == 0:
        # find how many special chars there are in the current segment
        len_special_chars = len(re.findall(r'[.,!?:;\'"]', word))  
                                            # ^ add more special chars here if needed
        
        # subtract the number of special chars from the length of segment
        str_len = len(word) - len_special_chars
        
        # repeat '#' for every non-special char, then re-attach the special
        # chars (this assumes they trail the word, as in 'adventurer,')
        out_list.append('#' * str_len + word[str_len:])
    else:
        # if the index is not a multiple of n, just add the word
        out_list.append(word)
        

print(' '.join(out_list))
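
The loop above treats the special characters as a suffix of each segment. If they can also appear at the start or in the middle (an opening quotation mark, for example), one possible variant is to substitute every non-special character in place. This is only a sketch: the names mask_every_nth and SPECIAL_CHARS are made up for illustration, and the character set is assumed to be the one listed in the question.

```python
import re

# Characters that stay visible and do not count toward the '#' length.
# (Assumed set, taken from the list in the question.)
SPECIAL_CHARS = ".,!?:;'\""


def mask_every_nth(text, n=3):
    """Replace every n-th space-separated word with '#', keeping punctuation."""
    # Character class matching anything that is NOT a special character.
    not_special = re.compile("[^" + re.escape(SPECIAL_CHARS) + "]")
    words = text.split(" ")
    # Indices n-1, 2n-1, ... are the n-th, 2n-th, ... words.
    for idx in range(n - 1, len(words), n):
        # Swap each non-special character for '#', wherever it sits in the word.
        words[idx] = not_special.sub("#", words[idx])
    return " ".join(words)


print(mask_every_nth("My dear adventurer, do you understand the nature of the given discussion?"))
```

Because each character is replaced where it stands, a segment such as "hello" wrapped in quotation marks keeps both quotation marks around the # run.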

CodePudding user response：

There are more efficient ways to solve this question, but I hope this is the simplest!

My approach is:

1. Split the sentence into a list of words.

2. Using that, make a list of every third word.

3. Remove unwanted characters from those words.

4. Replace the third words in the original string with # repeated once per remaining character.

Here's the code (explained in the comments):

# original line
line = "My dear adventurer, do you understand the nature of the given discussion?"

# printing original line
print(f'\n\nOriginal Line:\n"{line}"\n')

# printing something to indicate that the next few prints show what is happening after each line
print('\n\nStages of parsing:')

# splitting by spaces, into list
wordList = line.split(' ')

# printing wordlist
print(wordList)

# making list of every third word
thirdWordList = [wordList[i-1] for i in range(1, len(wordList)+1) if i%3 == 0]

# printing third-word list
print(thirdWordList)

# characters that you don't want hashed
unwantedCharacters = ['.','/','|','?','!','_','"',',','-','@','\n','\\']

# replacing these characters by empty strings in the list of third-words
for unwantedchar in unwantedCharacters:
    for i in range(0,len(thirdWordList)):
        thirdWordList[i] = thirdWordList[i].replace(unwantedchar,'')

# printing third word list, now without punctuation 
print(thirdWordList)

# replacing with #
# (note: str.replace substitutes every occurrence of the text, not only the third-word positions)
for word in thirdWordList:
    line = line.replace(word, len(word)*'#')

# Voila! Printing the result:
print(f'\n\nFinal Output:\n"{line}"\n\n')
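
One thing to watch with the line.replace step: str.replace changes every occurrence of the text it is given, so a third word like "the" would also hash the first three letters of "then". A position-based variant avoids that by editing the word list directly and joining it back. This is a rough sketch only: hash_third_words, the unwanted parameter and the sample sentence are made up for illustration.

```python
def hash_third_words(line, unwanted=".,!?:;'\""):
    """Mask every third word by its position instead of using str.replace."""
    words = line.split(" ")
    # Indices 2, 5, 8, ... are the 3rd, 6th, 9th, ... words.
    for i in range(2, len(words), 3):
        # Keep the unwanted (punctuation) characters, hash everything else.
        words[i] = "".join(ch if ch in unwanted else "#" for ch in words[i])
    return " ".join(words)


sample = "We saw the cat, then the dog chased the cat."
print(hash_third_words(sample))
```

Here only the third, sixth and ninth words are touched, so "then" is left alone even though it contains "the".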