Counting the total number of lines except the ones that start with a special character

Time:10-25

I want to get the number of atoms from a text file. The file starts with a couple of header lines and may sometimes contain additional info lines, which also start with special characters. A sample text file looks like this:

% site-data vn=3.0
#                        pos
Ga        0.0000000   0.0000000   0.0000000
As        0.2500000   0.2500000   0.2500000 

My approach was to count the total number of lines and subtract the number of lines that start with special characters. Here is my attempt:

def get_atom_number():
    count = 0
    with open(sitefile,'r') as site:
        x = len(site.readlines())
        for line in site.readlines():
            if '#' in line or '%' in line:
                count += 1
    return x-count

The problem with this function is, with the x (total number of lines) defined, the counter (num. of lines that start with special chars) returns 0. If I delete that line, it works. Now, I can divide these two into two functions, but I believe this should work fine, and I want to know what I'm doing wrong.

CodePudding user response:

if '#' in line or '%' in line: checks whether the characters appear anywhere in the line. Use startswith instead:

if line.startswith(('#', '%')):
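To illustrate the difference, here is a small sketch. The atom line with a trailing comment is a hypothetical example, not taken from the sample file:

```python
# Hypothetical atom line that carries a trailing comment.
line = "Ga  0.0  0.0  0.0   # gallium site"

# Membership test: True, so the line would wrongly be treated as a header.
print('#' in line or '%' in line)

# Prefix test: False, so the line is correctly kept as an atom line.
print(line.startswith(('#', '%')))
```

Note that startswith accepts a tuple of prefixes, so both characters can be checked in a single call.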

Now, regarding the counting method, you can also increment the counter only when the line does not start with one of these characters. That way you don't need to know the total number of lines in advance and don't need to read all the lines up front:

if not line.startswith(('#', '%')):
    counter += 1

Then you can return (or print) the counter directly at the end.

Full code:

def get_atom_number():
    count = 0
    with open(sitefile,'r') as site:
        # iterate over the file object so lines are read one at a time
        for line in site:
            if not line.startswith(('#', '%')):
                count += 1
    return count
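As a minimal sketch of the same idea, the variant below takes the file path as a parameter instead of relying on the global sitefile, and also skips blank lines in case the file contains any. The names count_atom_lines and prefixes are hypothetical, introduced here only for illustration:

```python
def count_atom_lines(path, prefixes=("#", "%")):
    """Count lines that are neither blank nor start with one of `prefixes`."""
    count = 0
    with open(path, "r") as site:
        # Iterating over the file object reads one line at a time.
        for line in site:
            stripped = line.strip()
            # Skip blank lines and header/info lines.
            if stripped and not stripped.startswith(prefixes):
                count += 1
    return count
```

Stripping the line first also means a header line with leading spaces is still recognized as a header. Whether that is desirable depends on the file format, so treat it as an assumption.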

CodePudding user response:

The problem you're facing is that .readlines() consumes the entire file when it executes. If you call it again, nothing comes out, since it's already at the end of the file.
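The behaviour can be reproduced with a short sketch. It uses io.StringIO as a stand-in for an open text file, so no file on disk is needed. The file contents are made up for the example:

```python
import io

# io.StringIO behaves like a text file opened for reading.
site = io.StringIO("% site-data vn=3.0\nGa 0.0 0.0 0.0\nAs 0.25 0.25 0.25\n")

first = site.readlines()   # reads all three lines; the cursor is now at the end
second = site.readlines()  # nothing left to read, so this is an empty list
site.seek(0)               # move the cursor back to the start
third = site.readlines()   # the lines are available again

print(len(first), len(second), len(third))  # 3 0 3
```

Calling seek(0) before the second read would also work, but reading the lines once and reusing them avoids going through the file twice.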

The solution is to assign site.readlines() to a variable first, and then change the following two lines to refer to that variable. This way, you only call it once.

def get_atom_number():
    count = 0
    with open(sitefile,'r') as site:
        lines = site.readlines()
        x = len(lines)
        for line in lines:
            if '#' in line or '%' in line:
                count += 1
    return x - count

CodePudding user response:

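Iterate over the file object line by line instead of calling readlines(). (Note that looping over site.readline() would not work: readline() returns a single string, so the loop would run over the characters of the first line only.)

def get_atom_number():
    count = 0
    with open(sitefile,'r') as site:
        for line in site:
            if '#' not in line and '%' not in line:
                count += 1
    return count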

CodePudding user response:

The first call to site.readlines(), on line 4 of your code, moves the file cursor to the end of the file, so the second call on line 5 returns only an empty list. The code below saves the result of site.readlines() in a variable, lines, and reuses it, which should solve your problem.

def get_atom_number():
    count = 0
    with open(sitefile,'r') as site:
        lines = site.readlines()
        x = len(lines)
        for line in lines:
            if '#' in line or '%' in line:
                count += 1
    return x - count