Reformatting strings in a list - Python

I have a list that I'm trying to reformat.

data = ['Height:\n      \n      6\' 4"', 'Weight:\n      \n      185 lbs.', 'Reach:\n      \n      80"', 'STANCE:\n      \n      Switch', 'DOB:\n      \n      \n        Jul 22, 1989', 'SLpM:\n          \n\n          3.93', 'Str. Acc.:\n          \n          49%', 'SApM:\n          \n          2.67', 'Str. Def:\n          \n          59%', '', 'TD Avg.:\n          \n          0.00', 'TD Acc.:\n          \n          0%', 'TD Def.:\n          \n          78%', 'Sub. Avg.:\n          \n          0.2']

I've tried using strip.

for info in data:
        info.strip('\n      \n      ')

But, I'm still getting the same output.

How can I remove the whitespace (the "\n \n " runs) inside each element of the list, to get the following?

data = ['Height: 6\' 4"', 'Weight: 185 lbs.', 'Reach: 80"', 'STANCE: Switch', 'DOB: Jul 22, 1989', 'SLpM: 3.93', 'Str. Acc.: 49%', 'SApM: 2.67', 'Str. Def: 59%', '', 'TD Avg.: 0.00', 'TD Acc.: 0%', 'TD Def.: 78%', 'Sub. Avg.: 0.2']

CodePudding user response：

Try this:

import re

def remove_multiple_ws(s: str) -> str:
    # \s+ matches one or more whitespace characters (spaces, tabs, newlines)
    return re.sub(r"\s+", " ", str(s))


data = [remove_multiple_ws(s) for s in data]
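
For context on why the loop in the question changed nothing: str.strip returns a new string instead of modifying the original (strings are immutable), the loop throws that return value away, and strip only trims characters from the two ends of a string, never from the middle. Below is a minimal sketch of a regex-free alternative that relies on str.split with no arguments followed by " ".join. The helper name collapse_ws and the shortened sample list are only for illustration.

```python
def collapse_ws(s: str) -> str:
    # split() with no arguments splits on any run of whitespace and
    # ignores leading/trailing whitespace, so joining the pieces with
    # a single space collapses every run to one space.
    return " ".join(s.split())


# Shortened sample taken from the question's data (illustration only).
sample = ['Height:\n      \n      6\' 4"', '', 'SLpM:\n          \n\n          3.93']

# strip() returns a new string and only looks at the ends; the
# whitespace here sits in the middle, so nothing is removed.
unchanged = sample[0].strip('\n ')

cleaned = [collapse_ws(s) for s in sample]
print(cleaned)
```

The empty string in the list passes through unchanged, since splitting it yields no pieces to join.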

CodePudding user response：

Here is my approach: replace the colon and any whitespace that follows it with a colon and a single space:

import re

pattern = re.compile(r":\s*")
new_data = [
    pattern.sub(": ", datum)
    for datum in data
]

new_data then becomes:

['Height: 6\' 4"',
 'Weight: 185 lbs.',
 'Reach: 80"',
 'STANCE: Switch',
 'DOB: Jul 22, 1989',
 'SLpM: 3.93',
 'Str. Acc.: 49%',
 'SApM: 2.67',
 'Str. Def: 59%',
 '',
 'TD Avg.: 0.00',
 'TD Acc.: 0%',
 'TD Def.: 78%',
 'Sub. Avg.: 0.2']
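
One thing to be aware of with this pattern: it rewrites every colon in a string, not only the one after the label. That is harmless for the data shown, but a value that itself contains a colon would be altered. Here is a hedged sketch, using a made-up 'Fight time' entry that is not in the original data, of how the count argument of sub limits the replacement to the first match:

```python
import re

pattern = re.compile(r":\s*")

# Hypothetical entry (not from the question) whose value contains a colon.
datum = 'Fight time:\n      \n      12:30'

every_colon = pattern.sub(": ", datum)           # rewrites both colons
first_only = pattern.sub(": ", datum, count=1)   # rewrites only the label's colon

print(every_colon)
print(first_only)
```

Whether count=1 is worth adding depends on whether such values can occur in the real data.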

CodePudding user response：

You can use re.sub to collapse any run of whitespace (spaces, tabs, newlines) into a single space.

From the documentation:

re.sub(pattern, repl, string, count=0, flags=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn’t found, string is returned unchanged.

Here is one way re.sub could be used in this situation:

>>> import re
>>> mystring = ' string    string \t\n\n string'
>>> pattern = re.compile(r'\s+')
>>> pattern.sub(" ", mystring)
' string string string'

Using this method, an implementation for your code would look something like this:

import re

# \s+ matches one or more whitespace characters
pattern = re.compile(r"\s+")
new_data = [pattern.sub(" ", part) for part in data]

Here is what new_data should be:

kali@kali:~$ python3 -i test.py
>>> new_data
['Height: 6\' 4"',
 'Weight: 185 lbs.',
 'Reach: 80"',
 'STANCE: Switch',
 'DOB: Jul 22, 1989',
 'SLpM: 3.93',
 'Str. Acc.: 49%',
 'SApM: 2.67',
 'Str. Def: 59%',
 '',
 'TD Avg.: 0.00',
 'TD Acc.: 0%',
 'TD Def.: 78%',
 'Sub. Avg.: 0.2']
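
If some entries might also carry whitespace at the very start or end, note that the substitution alone leaves a single space there. A small sketch, assuming a hypothetical padded entry that is not part of the question's data, of chaining strip() to trim the ends:

```python
import re

pattern = re.compile(r"\s+")

# Hypothetical entry with padding at both ends (not from the question's data).
padded = '  Reach:\n      \n      80"\n'

collapsed = pattern.sub(" ", padded)   # runs collapsed, one space left at each end
trimmed = collapsed.strip()            # strip() with no argument trims both ends

print(repr(collapsed))
print(repr(trimmed))
```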

If you want to learn more about regex in Python, here are some useful links: