Why does .replace() leave whitespace at the end of some strings?

Time:01-03

I'm trying to extract specific information from some text data. Each line contains a person's name and his/her school mark. The data has this format:

Xxxxx Yyyyyy: B
Aaaaa Bbbbbb: A
Ccccc Dddddd: C
.
.
.
Mmmmm Nnnnnn: B

This was a task in a data science course on Coursera, where we need to extract the names of students with a B mark into a list using regex in Python. I already did it with regex and am now trying an alternative way.

I tried this:

def grades():
    with open ("./grades.txt", "r") as file:
        grades = file.read()
    
    grades = grades.splitlines()
    matches = []
    for marks in grades:
        if ": B" in marks:
            matches.append(marks)
    matches = [match.replace(': B', '') for match in matches]
    return matches
print(grades())

Somehow it worked but it left some whitespace after some names. Can anyone explain to me why?

CodePudding user response:

There may be a space after the 'B'.

With match.replace(': B', '') you are only replacing ': B' with an empty string. Any spaces that followed it in the original line are still there.
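A minimal sketch of that effect, assuming one line of the file ends with a space after the grade (the sample lines here are made up, not taken from the actual grades.txt). Printing with repr() makes the leftover whitespace visible:

```python
# Hypothetical sample lines; the second one has a trailing space after the grade.
lines = ["Xxxxx Yyyyyy: B", "Mmmmm Nnnnnn: B "]

# replace() removes only the exact substring ': B'; whatever follows it stays.
cleaned = [line.replace(": B", "") for line in lines]

# strip() then removes leading and trailing whitespace from each result.
stripped = [name.strip() for name in cleaned]

# repr() shows the quotes around each string, so a trailing space is easy to spot.
for name in cleaned:
    print(repr(name))
```

The second printed value keeps its trailing space inside the quotes, while the entries in stripped do not.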

CodePudding user response:

I think your file has some whitespace after the name or after ': B'. Use strip() to remove it.

Use this code.

def grades():
    with open("./grades.txt", "r") as file:
        grades = file.read()

    grades = grades.splitlines()
    matches = [marks.replace(": B", "").strip() for marks in grades if ": B" in marks]

    return matches


print(grades())

Also, there is no need to loop over grades twice.
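As a possible variation on the same idea, here is a sketch that splits each line at the colon instead of replacing a fixed substring, so whitespace on either side of the name or the grade does not matter. The helper name names_with_grade and the sample list are hypothetical, and the function takes a list of lines rather than reading grades.txt so that it can be tried on its own:

```python
def names_with_grade(lines, grade="B"):
    """Return the names whose grade equals `grade`, ignoring surrounding whitespace."""
    names = []
    for line in lines:
        # partition() splits on the first ':' only and returns (before, sep, after).
        name, sep, mark = line.partition(":")
        # sep is '' when the line has no colon, so such lines are skipped.
        if sep and mark.strip() == grade:
            names.append(name.strip())
    return names


# Made-up sample with stray whitespace in several places.
sample = ["Xxxxx Yyyyyy: B", "Aaaaa Bbbbbb: A", "Mmmmm Nnnnnn: B  ", " Ccccc Dddddd : C"]
result = names_with_grade(sample)
print(result)
```

Comparing the stripped grade for equality also avoids matching a line by accident just because ': B' happens to appear somewhere inside it.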

CodePudding user response:

Stray whitespace can cause subtle issues like this and should not be ignored. I'd recommend calling strip() on the result of replace(). It does nothing when there is no leading or trailing whitespace, and it removes that whitespace when there is some.

def grades():
    with open("./grades.txt", "r") as file:
        grades = file.read()

    grades = grades.splitlines()
    matches = []
    for marks in grades:
        if ": B" in marks:
            matches.append(marks)
    matches = [match.replace(': B', '').strip() for match in matches]
    return matches
print(grades())
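A small check of the claim that strip() is harmless on clean input, using made-up strings. It also shows that strip() handles tabs and newlines as well as spaces, and that rstrip() trims only the right-hand side:

```python
clean = "Xxxxx Yyyyyy"
padded = "\tXxxxx Yyyyyy \n"

# On a string with no surrounding whitespace, strip() returns an equal string.
unchanged = clean.strip()

# strip() removes whitespace on both sides; rstrip() only on the right.
both_sides = padded.strip()
right_only = padded.rstrip()

print(repr(unchanged), repr(both_sides), repr(right_only))
```

Whitespace inside the string, such as the space between the first and last name, is left alone in every case.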