I need help with a school assignment. I'm trying to strip the whitespace from the non-blank lines of my output in Python 3 so that I can convert the string into a list. I can use RegEx but not xml in my code. Currently, when I run my program, it outputs this:
Belgian Waffles
$5.95
Two of our famous Belgian Waffles with plenty of real maple syrup
650
Strawberry Belgian Waffles
$7.95
Light Belgian waffles covered with strawberries and whipped cream
900
Berry-Berry Belgian Waffles
$8.95
Light Belgian waffles covered with an assortment of fresh berries and whipped cream
900
French Toast
$4.50
Thick slices made from our homemade sourdough bread
600
Homestyle Breakfast
$6.95
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
950
(end of output)
The result I'm trying to get is this:
Belgian Waffles
$5.95
Two of our famous Belgian Waffles with plenty of real maple syrup
650
Strawberry Belgian Waffles
$7.95
Light Belgian waffles covered with strawberries and whipped cream
900
Berry-Berry Belgian Waffles
$8.95
Light Belgian waffles covered with an assortment of fresh berries and whipped cream
900
French Toast
$4.50
Thick slices made from our homemade sourdough bread
600
Homestyle Breakfast
$6.95
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
950
(end of output)
This is the current code that I have right now:
import os
import re

def get_filename():
    print("Enter the name of the file: ")
    filename = input()
    return filename

def read_file(filename):
    if os.path.exists(filename):
        with open(filename, "r") as file:
            full_text = file.read()
            return full_text
    else:
        print("This file does not exist")

def get_tags(full_text):
    tags = re.findall('<.*?>', full_text)
    for tag in tags:
        full_text = full_text.replace(tag, '')
    return tags

def get_text(text):
    tags = re.findall('<.*?>', text)
    for tag in tags:
        text = text.replace(tag, '')
    text = text.strip()
    return text

def display_output(text):
    print(text)

def main():
    filename = get_filename()
    full_text = read_file(filename)
    tags = get_tags(full_text)
    text = get_text(full_text)
    display_output(text)

main()
I want the output to have no leading whitespace, so that I can convert the string to a list without that whitespace being counted as elements. Any help or suggestions would be appreciated.
CodePudding user response:
Split the page on the newline character \n and loop through the resulting list.
page = ...
new_page = ''
for line in page.split('\n'):
    new_page += line.strip() + '\n'
Each line is stripped of its leading and trailing whitespace and then appended to a new string variable.
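If the end goal is a list rather than a new string, the same split-and-strip idea can build the list directly. This is only a rough sketch: strip_lines and the sample string are made-up names and data for illustration, and it assumes blank lines should be dropped so they don't end up as empty elements.

```python
def strip_lines(page):
    """Return the non-blank lines of page with surrounding whitespace removed."""
    lines = []
    for line in page.split('\n'):
        stripped = line.strip()
        if stripped:  # skip lines that were empty or whitespace-only
            lines.append(stripped)
    return lines


# Hypothetical sample mimicking indented output like the one in the question.
sample = "Belgian Waffles\n    $5.95\n\n      650\n"
items = strip_lines(sample)
print(items)  # ['Belgian Waffles', '$5.95', '650']
```

In the question's code this would probably be called on the value returned by get_text, in place of printing it straight away.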
Good day.
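Since the question says RegEx is allowed, here is a possible alternative using re.sub with the re.MULTILINE flag, so that ^ and $ match at the start and end of every line. Treat it as a sketch: strip_lines_regex is a hypothetical name, and it assumes only spaces and tabs need removing.

```python
import re


def strip_lines_regex(page):
    """Strip leading/trailing spaces and tabs on every line, then list the non-empty lines."""
    # With re.MULTILINE, ^ and $ anchor at each line boundary, not just at the string ends.
    cleaned = re.sub(r'^[ \t]+|[ \t]+$', '', page, flags=re.MULTILINE)
    # Whitespace-only lines are now empty, so filter them out.
    return [line for line in cleaned.split('\n') if line]


# Hypothetical sample, not taken from the real file.
print(strip_lines_regex("  French Toast\n\t$4.50  \n   \n600\n"))
# ['French Toast', '$4.50', '600']
```

For this task the plain str.strip() loop is likely easier to read. The regex version is mainly useful if the cleaned text is wanted as a single string before splitting.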