EASY Python: How do I combine lines in a text file in a specific order?

Time:04-22

Goal: If a line does not begin with "https", prepend that line's word to each subsequent line until you reach another line that does not begin with "https".

Text File Contents:

Fruit
https://www.apple.com//
https://www.banana.com//
Vegetable
https://www.cucumber.com//
https://www.lettuce.com//

Desired Output:

Fruit-https://www.apple.com//
Fruit-https://www.banana.com//
Vegetable-https://www.cucumber.com//
Vegetable-https://www.lettuce.com//

Attempt:

one = open("links.txt", "r")
for two in one.readlines():

    if "https" not in two:
        sitex = two
        
    else:
        print(sitex + "-" + two)

Attempt Output:

Fruit
-https://www.apple.com//

Fruit
-https://www.banana.com//       

Vegetable
-https://www.cucumber.com//     

Vegetable
-https://www.lettuce.com//   

PLS HELP.

CodePudding user response:

Two changes are needed. First, call rstrip() on sitex to remove the newline character at the end of the string (credit to BrokenBenchmark).

Second, print() appends a newline every time it is called, and two still ends with its own newline, so pass end="" to avoid printing a blank line after each link.

So your code should look like this:

one = open("links.txt", "r")
for two in one.readlines():
    if "https" not in two:
        sitex = two.rstrip()
    else:
        print(sitex + "-" + two, end="")
one.close()

Also, always close the file when you are done with it.
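To see why both changes matter, here is a small sketch that captures what the loop prints before and after the fix. It uses a hypothetical in-memory list in place of links.txt, and the helper name render is made up for illustration.

```python
import io
from contextlib import redirect_stdout

# Hypothetical stand-in for two lines read from links.txt
lines = ["Fruit\n", "https://www.apple.com//\n"]

def render(strip_label, end):
    """Return the text the loop prints for the given choices."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        label = ""
        for line in lines:
            if "https" not in line:
                # Without rstrip() the label keeps its trailing "\n"
                label = line.rstrip() if strip_label else line
            else:
                print(label + "-" + line, end=end)
    return buf.getvalue()

broken = render(strip_label=False, end="\n")
fixed = render(strip_label=True, end="")

print(repr(broken))  # 'Fruit\n-https://www.apple.com//\n\n'
print(repr(fixed))   # 'Fruit-https://www.apple.com//\n'
```

Printing with repr() makes the stray "\n" characters visible, which is usually the quickest way to diagnose this kind of output.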

CodePudding user response:

Lines in your file end with "\n", the newline character.

You can remove whitespace (including "\n") from a string using strip() (both ends) or rstrip() / lstrip() (one end only).

print() adds a "\n" at its end by default; you can change this using the end parameter:

print("something", end=" ")
print("more")   # ==> something more, on one line

Fix for your code:

# use a context manager for better file handling
with open("data.txt","w") as f:
    f.write("""Fruit
https://www.apple.com//
https://www.banana.com//
Vegetable
https://www.cucumber.com//
https://www.lettuce.com//
""")


with open("data.txt") as f:
    what = ""
    # iterate file line by line instead of reading all at once
    for line in f:
        # remove whitespace from current line, including \n
        # front AND back - you could use rstrip() here as well
        line = line.strip()
        # only do something for non-empty lines (your file does not
        # contain empty lines, but the last line may be empty)
        if line:
            # easier to understand condition without negation
            if line.startswith("http"):
                # printing adds a \n at the end
                print(f"{what}-{line}") # line & what are stripped
            else:
                what = line

Output:

Fruit-https://www.apple.com//
Fruit-https://www.banana.com//
Vegetable-https://www.cucumber.com//
Vegetable-https://www.lettuce.com//

See the Python documentation for str.strip([chars]), str.rstrip([chars]) and str.lstrip([chars]):

[chars] is optional; if it is not given, whitespace is removed.
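A quick sketch of how the three methods differ, with and without the optional chars argument. The sample strings are made up for illustration. Note that chars is treated as a set of characters to remove, not as a suffix.

```python
raw = "  Fruit\n"

print(repr(raw.strip()))   # 'Fruit'      (both ends)
print(repr(raw.rstrip()))  # '  Fruit'    (right end only)
print(repr(raw.lstrip()))  # 'Fruit\n'    (left end only)

url = "https://www.apple.com//\n"

# Remove only the trailing newline
print(repr(url.rstrip("\n")))   # 'https://www.apple.com//'

# Remove any trailing run of "/" and "\n" characters
print(repr(url.rstrip("/\n")))  # 'https://www.apple.com'
```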

CodePudding user response:

You need to strip the trailing newline from the line if it doesn't contain 'https':

sitex = two

should be

sitex = two.rstrip()

You need to do something similar for the else block as well, as ShadowRanger points out:

print(sitex + "-" + two)

should be

print(sitex + "-" + two.rstrip())
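Putting both fixes together, one possible way to package the logic is a small generator that works on any iterable of lines, including an open file. This is only a sketch: the name prefix_links and the sep parameter are hypothetical, and the sample list stands in for the contents of links.txt.

```python
from typing import Iterable, Iterator

def prefix_links(lines: Iterable[str], sep: str = "-") -> Iterator[str]:
    """Yield each https line prefixed with the most recent non-https line."""
    label = ""
    for raw in lines:
        line = raw.strip()           # drop the trailing "\n" and any padding
        if not line:
            continue                 # skip blank lines
        if line.startswith("https"):
            yield f"{label}{sep}{line}"
        else:
            label = line             # remember the new heading

# Hypothetical stand-in for the lines of links.txt
sample = [
    "Fruit\n",
    "https://www.apple.com//\n",
    "https://www.banana.com//\n",
    "Vegetable\n",
    "https://www.cucumber.com//\n",
    "https://www.lettuce.com//\n",
]

result = list(prefix_links(sample))
for row in result:
    print(row)
```

With a real file you could pass the file object directly, for example inside a with open("links.txt") as f: block, since iterating over a file yields its lines.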