Identify lines where word occurs in a text and save the lines in a list

Time:05-17

I'm currently trying to scan a text, identify the lines in which the word occurs multiple times and save the lines in a list. If the word does not appear in a text, an empty list should be returned.

This is what I have so far:

def line_number(text, word):
    with open(text) as file:
        lines = file.readlines()
    for line_number, line in enumerate(lines, 1):
        if word in line:
            print(f'{word} is in the line {line_number}')
        else:
            pass
    print("None")

At this point, I can print out the lines where a word occurs, but I need a way to save them in a list.

CodePudding user response:

This looks like SEO-style functionality. One option is to save each matching line right after the print statement, for example by writing it to a separate file in case you need it later, and to trigger that step as needed, with each line addressed by its position relative to the first line.

CodePudding user response:

You can append a tuple (line, count) for each line in lines:

def line_number(text, word):
    with open(text) as file:
        lines = file.readlines()

    lst = [(x.strip(), x.count(word)) for x in lines if x.strip()]

    return lst

A test file, with word='test':

test test test

test sdf sdfuih test
asdlkj
123

returns

[('test test test', 3),
 ('test sdf sdfuih test', 2),
 ('asdlkj', 0),
 ('123', 0)]
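
Note that str.count counts substrings, so 'test' would also be counted inside 'testing'. If only whole-word matches should count, a minimal sketch could look like the following. count_whole_word is a hypothetical helper introduced here for illustration, not part of the answer above, and it assumes word starts and ends with a letter, digit or underscore so that \b boundaries apply.

```python
import re


def count_whole_word(line, word):
    """Count occurrences of word in line as a whole word (hypothetical helper)."""
    # re.escape protects against regex metacharacters inside word;
    # \b only matches at a boundary between word and non-word characters.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, line))
```

It could replace x.count(word) in the list comprehension above, as count_whole_word(x, word).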

You can then sort the results by count, or pick the line with the most occurrences:

>>> x = line_number("file2.txt", "test")
# Sorted by count (index 1 of each tuple), highest first
>>> sorted(x, key=lambda a: a[1], reverse=True)
[('test test test', 3),
 ('test sdf sdfuih test', 2),
 ('asdlkj', 0),
 ('123', 0)]
# Max by count
>>> max(x, key=lambda a: a[1])
('test test test', 3)

Or you can save the line number instead of the line:

# i + 1 means first line is 1
lst = [(i + 1, x.count(word)) for i, x in enumerate(lines) if x.strip()]

Edit: as requested by your comment, this just gets the line number for occurrences of the word:

# returns [1, 3] for test file above
lst = [i + 1 for i, x in enumerate(lines) if word in x]
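
Since the question asks for lines where the word occurs multiple times, the pieces above can be combined into one function. This is only a sketch: the name lines_with_word and the min_count parameter are assumptions made for illustration, and matching is by substring, as with str.count.

```python
def lines_with_word(path, word, min_count=2):
    """Return 1-based numbers of lines containing word at least min_count times."""
    with open(path) as file:
        # Iterating over the file yields one line at a time;
        # start=1 makes the first line number 1 instead of 0.
        return [
            number
            for number, line in enumerate(file, start=1)
            if line.count(word) >= min_count
        ]
```

For the test file above with word='test', this gives [1, 3]. If the word never reaches min_count on any line, the result is an empty list.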