I'm trying to transform the text in a file according to the following rule: for each line, if the line does not begin with "https", add that word (followed by a hyphen) to the beginning of each subsequent line until you hit another line that does not begin with "https".
For example, given this file:
Fruit
https://www.apple.com//
https://www.banana.com//
Vegetable
https://www.cucumber.com//
https://www.lettuce.com//
I want
Fruit-https://www.apple.com//
Fruit-https://www.banana.com//
Vegetable-https://www.cucumber.com//
Vegetable-https://www.lettuce.com//
Here is my attempt:
one = open("links.txt", "r")
for two in one.readlines():
    if "https" not in two:
        sitex = two
    else:
        print (sitex + "-" + two)
Here is the output of that program, using the above sample input file:
Fruit
-https://www.apple.com//
Fruit
-https://www.banana.com//
Vegetable
-https://www.cucumber.com//
Vegetable
-https://www.lettuce.com//
What is wrong with my code?
CodePudding user response:
To fix that, first call rstrip() on sitex to remove the newline character at the end of the string (credit to BrokenBenchmark).
Second, print() ends its output with a newline every time it's called, so we must pass end="" to suppress it; two still carries its own newline from the file.
So your code should look like this:
one = open("links.txt", "r")
for two in one.readlines():
    if "https" not in two:
        sitex = two.rstrip()
    else:
        print (sitex + "-" + two, end="")
one.close()
Also always close the file when you are done.
CodePudding user response:
Lines in your file end with "\n", the newline character.
You can remove whitespace (including "\n") from a string using strip() (both ends) or rstrip() / lstrip() (one end only).
print() adds a "\n" at the end by default; you can change this with the end parameter:
print("something", end=" ")
print("more")  # ==> something more (on one line)
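As a small illustration of the difference between the three methods, here is a sketch using a made-up string (the variable names are only for this example). repr() is used so the remaining whitespace is visible:

```python
# A line as it might come out of a file: leading spaces and a trailing newline.
raw = "  Fruit\n"

both = raw.strip()    # whitespace removed from both ends
right = raw.rstrip()  # only the right end is stripped (including the \n)
left = raw.lstrip()   # only the left end is stripped (the \n is kept)

print(repr(both), repr(right), repr(left))

# print() ends with "\n" by default; end="" suppresses it,
# so the next print() continues on the same line.
print("Fruit", end="")
print("-https://www.apple.com//")
```

For the file in the question, rstrip() is enough, because the only unwanted whitespace is the newline at the end of each line.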
Fix for your code:
# use a context manager for better file handling
with open("data.txt", "w") as f:
    f.write("""Fruit
https://www.apple.com//
https://www.banana.com//
Vegetable
https://www.cucumber.com//
https://www.lettuce.com//
""")
with open("data.txt") as f:
    what = ""
    # iterate over the file line by line instead of reading it all at once
    for line in f:
        # remove whitespace from the current line, including \n,
        # front AND back - you could use rstrip() here as well
        line = line.strip()
        # only do something for non-empty lines (your file does not
        # contain empty lines, but the last line may be empty)
        if line:
            # easier-to-understand condition without negation
            if line.startswith("http"):
                # print() adds a \n at the end
                print(f"{what}-{line}")  # line & what are stripped
            else:
                what = line
Output:
Fruit-https://www.apple.com//
Fruit-https://www.banana.com//
Vegetable-https://www.cucumber.com//
Vegetable-https://www.lettuce.com//
See the Python documentation for str.strip([chars]), str.rstrip([chars]) and str.lstrip([chars]):
[chars] is optional - if it is not given, whitespace is removed.
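To show what the optional chars argument does, here is a short sketch; the strings are invented for the example. Note that chars is treated as a set of characters to remove, not as a prefix or suffix:

```python
url = "https://www.apple.com//\n"

# No argument: trailing whitespace (here the newline) is removed.
no_newline = url.rstrip()

# With an argument: every trailing "/" is removed, however many there are.
no_slashes = no_newline.rstrip("/")

# chars is a set of characters: any mix of "x" and "y" is stripped from both ends.
mixed = "xyxhixyx".strip("xy")

print(repr(no_newline), repr(no_slashes), repr(mixed))
```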
CodePudding user response:
You need to strip the trailing newline from the line if it doesn't contain 'https':
sitex = two
should be
sitex = two.rstrip()
You need to do something similar in the else block as well, as ShadowRanger points out:
print (sitex + "-" + two)
should be
print (sitex + "-" + two.rstrip())
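Putting both changes together, one possible way to write the whole thing is sketched below. The function name prefix_links and the variable category are made up for this example, and the sample lines stand in for the contents of links.txt so the snippet runs without a file. It uses startswith("https") rather than "https" in line, which matches the rule as stated in the question:

```python
def prefix_links(lines):
    """Yield 'Category-URL' for each URL line, using the most recent category line."""
    category = ""
    for raw in lines:
        line = raw.rstrip()  # drop the trailing newline
        if not line:
            continue  # skip blank lines, e.g. at the end of the file
        if line.startswith("https"):
            yield f"{category}-{line}"
        else:
            category = line


# Stand-in for the lines of links.txt (each one ends with a newline).
sample = [
    "Fruit\n",
    "https://www.apple.com//\n",
    "https://www.banana.com//\n",
    "Vegetable\n",
    "https://www.cucumber.com//\n",
    "https://www.lettuce.com//\n",
]

result = list(prefix_links(sample))
for row in result:
    print(row)
```

Since an open file object is itself an iterable of lines, the same function should also work as prefix_links(f) inside a with open("links.txt") as f: block.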