Print a string containing only letters, with spaces replaced by underscores

I need to print a string using these rules: the first letter should be capitalized and all other letters should be lowercase. Only the characters a-z and A-Z are allowed in the name; any other characters have to be deleted. Spaces and tabs are not allowed either, and underscores are used in their place. The string cannot be longer than 80 characters.

It seems to me that it is possible to do it somehow like this:

name = "hello2 sjsjs- skskskSkD"
string = name[0].upper() + name[1:].lower()
length = len(string) - 1
answer = ""
for letter in string:
     x = letter.isalpha()
     if x == False:
        answer = string.replace(letter,"")
........
return answer

I think it's best to use a for loop or isalpha() here, but I can't work out how to do it properly. Can someone tell me how to do this?

CodePudding user response：

For one-to-one and one-to-None mappings of characters, you can use the .translate() method of strings. The string module provides lists (strings) of the various types of characters including one for all letters in upper and lowercase (string.ascii_letters) but you could also use your own constant string such as 'abcdef....xyzABC...XYZ'.

import string
def cleanLetters(S):
    nonLetters = S.translate(str.maketrans('', '', ' ' + string.ascii_letters))
    return S.translate(str.maketrans(' ','_',nonLetters))

Output:

cleanLetters("hello2 sjsjs- skskskSkD")
'hello_sjsjs_skskskSkD'
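
Note that the function above keeps the original case and only treats the space character as a separator. A minimal sketch of how the same .translate() idea might cover the remaining rules from the question (tabs, capitalization, the 80-character limit) follows. The name clean_name and the max_len parameter are hypothetical and not part of the answer above.

```python
import string

def clean_name(s, max_len=80):
    # Hypothetical helper: everything that is not an ASCII letter,
    # a space, or a tab is collected so it can be deleted.
    keep = string.ascii_letters + " \t"
    unwanted = "".join(set(s) - set(keep))
    # Map space and tab to "_" and delete every character in `unwanted`.
    table = str.maketrans(" \t", "__", unwanted)
    # capitalize() uppercases the first character and lowercases the rest;
    # the slice enforces the length limit.
    return s.translate(table).capitalize()[:max_len]

print(clean_name("hello2 sjsjs- skskskSkD"))  # Hello_sjsjs_skskskskd
```

Capitalizing after the cleanup means a leading digit or symbol does not prevent the first remaining letter from being uppercased. If the input starts with a space or tab, the result starts with an underscore and no letter is uppercased.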

CodePudding user response：

One method to accomplish this is to use regular expressions (regex) via the built-in re library. This enables the capturing of only the valid characters, and ignoring the rest.

Basic string tools then handle the replacement and capitalisation, with a slice at the end to enforce the length limit.

For example:

import re

name = 'hello2 sjsjs- skskskSkD'
trans = str.maketrans({' ': '_', '\t': '_'})

''.join(re.findall(r'[a-zA-Z \t]', name)).translate(trans).capitalize()[:80]

>>> 'Hello_sjsjs_skskskskd'
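
The same regex idea could also be written with re.sub() instead of re.findall() plus .translate(). This is only a sketch; the function name clean_name_re and the max_len parameter are made up for illustration.

```python
import re

def clean_name_re(name, max_len=80):
    # Step 1: delete everything except ASCII letters, spaces and tabs.
    allowed_only = re.sub(r"[^a-zA-Z \t]", "", name)
    # Step 2: turn each remaining space or tab into an underscore.
    with_underscores = re.sub(r"[ \t]", "_", allowed_only)
    # Step 3: capitalize the first character, lowercase the rest, truncate.
    return with_underscores.capitalize()[:max_len]

print(clean_name_re("hello2 sjsjs- skskskSkD"))  # Hello_sjsjs_skskskskd
```

Deleting first and substituting second means underscores already present in the input are removed like any other non-letter. Swapping the two steps (and allowing "_" in the first pattern) would keep them instead.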

CodePudding user response：

Strings are immutable, so every time you do string.replace() it needs to iterate over the entire string to find characters to replace, and a new string is created. Instead of doing this, you could simply iterate over the current string and create a new list of characters that are valid. When you're done iterating over the string, use str.join() to join them all.

answer_l = []
for letter in string:
    if letter == " " or letter == "\t":
        answer_l.append("_") # Replace spaces or tabs with _
    elif letter.isalpha():
        answer_l.append(letter) # Use alphabet characters as-is
    # else do nothing
answer = "".join(answer_l)

With string = 'hello2 sjsjs- skskskSkD', we have answer = 'hello_sjsjs_skskskSkD'.

Now you could also write this using a generator expression instead of creating the entire list and then joining it. First, we define a function that returns the letter or "_" for our first two conditions, and an empty string for the else condition:

def translate(letter):
    if letter == " " or letter == "\t":
        return "_"
    elif letter.isalpha():
        return letter
    else:
        return ""

Then,

answer = "".join(
             translate(letter) for letter in string
         )

To enforce the 80-character limit, just take answer[:80]. Because of the way slices work in Python, this won't throw an error even when the length of answer is less than 80.
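
One caveat: str.isalpha() also returns True for non-ASCII letters such as "é", while the question only allows a-z and A-Z. A possible way to put the pieces together with a stricter check is sketched below. The function name format_name and the max_len parameter are hypothetical, and checking membership in string.ascii_letters is a substitution for the isalpha() test used above.

```python
import string

ASCII_LETTERS = set(string.ascii_letters)

def format_name(text, max_len=80):
    # Keep ASCII letters as-is, turn spaces and tabs into "_",
    # and skip every other character.
    cleaned = "".join(
        "_" if ch in " \t" else ch
        for ch in text
        if ch in " \t" or ch in ASCII_LETTERS
    )
    # First character uppercase, the rest lowercase, then truncate.
    return cleaned.capitalize()[:max_len]

print(format_name("hello2 sjsjs- skskskSkD"))  # Hello_sjsjs_skskskskd
print("abc"[:80])  # abc (slicing past the end is not an error)
```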