Need help removing the remaining whitespace in a string in Python

Time:07-14

I need help with a school assignment. I'm trying to remove the leading whitespace from every non-empty line of my output in Python 3, so that I can convert the string into a list. I can use RegEx, but not xml, in my code. Currently, when I run my program, it outputs this:

Belgian Waffles
    $5.95
    Two of our famous Belgian Waffles with plenty of real maple syrup
    650


    Strawberry Belgian Waffles
    $7.95
    Light Belgian waffles covered with strawberries and whipped cream
    900


    Berry-Berry Belgian Waffles
    $8.95
    Light Belgian waffles covered with an assortment of fresh berries and whipped cream
    900


    French Toast
    $4.50
    Thick slices made from our homemade sourdough bread
    600


    Homestyle Breakfast
    $6.95
    Two eggs, bacon or sausage, toast, and our ever-popular hash browns
    950
    (end of output)

The result I'm trying to get is this:

Belgian Waffles
$5.95
Two of our famous Belgian Waffles with plenty of real maple syrup
650


Strawberry Belgian Waffles
$7.95
Light Belgian waffles covered with strawberries and whipped cream
900


Berry-Berry Belgian Waffles
$8.95
Light Belgian waffles covered with an assortment of fresh berries and whipped cream
900


French Toast
$4.50
Thick slices made from our homemade sourdough bread
600


Homestyle Breakfast
$6.95
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
950
(end of output)

This is the current code that I have right now:

import os
import re


def get_filename():
    print("Enter the name of the file: ")
    filename = input()
    return filename


def read_file(filename):
    if os.path.exists(filename):
        with open(filename, "r") as file:
            full_text = file.read()
            return full_text
    else:
        print("This file does not exist")


def get_tags(full_text):
    tags = re.findall('<.*?>', full_text)
    for tag in tags:
        full_text = full_text.replace(tag, '')
    return tags


def get_text(text):
    tags = re.findall('<.*?>', text)
    for tag in tags:
        text = text.replace(tag, '')
        text = text.strip()
    return text


def display_output(text):
    print(text)


def main():
    filename = get_filename()
    full_text = read_file(filename)

    tags = get_tags(full_text)
    text = get_text(full_text)

    display_output(text)


main()

I want the output to have no leading whitespace, so that I can convert the string to a list without the indentation being counted as elements. Any help or suggestions would be appreciated.

CodePudding user response:

Split the page by the newline character \n and loop through the list.

page = ...

new_page = ''
for line in page.split('\n'):
    # strip() removes both leading and trailing whitespace; re-add the newline
    new_page += line.strip() + '\n'

This adds each stripped line to a new string variable.
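
Since the goal is a list, the per-line stripping above can also be sketched as follows. This is only a minimal sketch: the function names (strip_lines, to_list, strip_indent_regex) and the sample string are made up for illustration, and it assumes blank lines should not become list elements. The last function shows a RegEx variant, since RegEx is allowed in the assignment.

```python
import re


def strip_lines(text):
    """Strip leading/trailing whitespace from every line, keeping blank lines."""
    return "\n".join(line.strip() for line in text.splitlines())


def to_list(text):
    """Return the stripped lines as a list, skipping lines that are empty."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def strip_indent_regex(text):
    """RegEx variant: remove spaces/tabs at the start of each line."""
    # re.MULTILINE makes ^ match at the start of every line, not just the string
    return re.sub(r"^[ \t]+", "", text, flags=re.MULTILINE)


# Hypothetical sample shaped like the output in the question
sample = "Belgian Waffles\n    $5.95\n    650\n\n\n    French Toast\n    $4.50"

print(strip_lines(sample))
print(to_list(sample))
```

In main() from the question, this would be called on the value returned by get_text(full_text), for example items = to_list(text).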

Good day.
