I have a huge number of .txt files, each consisting of one-column numeric values, such as:
-0.42424
0.5466
0.9
-0.4577
0
1.32
-0.933
...
Using the code
import numpy as np
My_data = np.loadtxt("/pathtodata")
loads `My_data` into Python. Is there any possibility to tell `np.loadtxt` that it should not load zero values (0), or at least replace them with another value of choice? Of course, one could remove or replace zeros in all txt files by hand, but the number of txt files and the list of values they contain is massive. Therefore, I am looking for an option to do this in Python, ideally without changing the actual data files.
I don't want to remove values/rows that merely start with 0, but rows that contain only 0, such as row 5 in my example above.
CodePudding user response:
I would do it the following way. Let the content of `file.txt` be
-0.42424
0.5466
0.9
-0.4577
0
1.32
-0.933
then
import numpy as np

def getnonzeros(filename):
    # Yield raw byte lines, skipping those that are exactly b"0"
    with open(filename, "rb") as f:
        for line in f:
            if line.strip() == b"0":
                continue
            yield line

arr = np.loadtxt(getnonzeros("file.txt"))
print(arr)
output
[-0.42424 0.5466 0.9 -0.4577 1.32 -0.933 ]
Explanation: `np.loadtxt` can accept a generator yielding `bytes`, so I craft a suitable one. It iterates over the lines of the file opened in read-binary mode (so the whole file is never loaded into memory), skipping lines that are equal to `b"0"` after jettisoning leading and trailing whitespace. Disclaimer: this code assumes zeros in your file are always rendered as `0`, not, for example, `0.0000`.
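If the files may contain zeros written in other forms (as the disclaimer above notes), one possible variant is to parse each line as a float inside the generator and skip exact zeros. This is a sketch, not the original answer's code; the sample file written at the top and the helper name `get_nonzero_lines` are made up for illustration:

```python
import numpy as np

# Write a small sample file containing zeros in different renderings
with open("file.txt", "w") as f:
    f.write("-0.42424\n0.5466\n0.0000\n-0.4577\n0\n1.32\n-0.933\n")

def get_nonzero_lines(filename):
    # Parse each line as a float and skip exact zeros,
    # so forms like "0.0000" or "-0" are filtered as well.
    with open(filename, "rb") as f:
        for line in f:
            stripped = line.strip()
            if not stripped:
                continue  # ignore blank lines
            if float(stripped) == 0.0:
                continue
            yield line

arr = np.loadtxt(get_nonzero_lines("file.txt"))
print(arr)  # all zero rows are gone
```

Note that `float(stripped)` will raise a `ValueError` on non-numeric lines, which may actually be desirable as an early warning about malformed data.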
CodePudding user response:
This seems like an elegant solution to me. However, it removes the 0s after the data has been imported, not while it's being imported. (Not sure if that matters.)
import numpy as np
my_data = np.loadtxt("pathtodata")
my_data = my_data[~(my_data==0)]
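Since the question also asks about replacing zeros with another value of choice, the same boolean mask can be used with `np.where` after loading. A minimal sketch, using an inline array in place of the loaded file and assuming `NaN` as the replacement value:

```python
import numpy as np

# Stand-in for np.loadtxt("pathtodata"); same values as the example file
my_data = np.array([-0.42424, 0.5466, 0.9, -0.4577, 0.0, 1.32, -0.933])

# Replace exact zeros with a placeholder (NaN here) instead of removing them,
# so the array keeps its original length and row positions
cleaned = np.where(my_data == 0, np.nan, my_data)
print(cleaned)
```

Keeping the rows (rather than dropping them) preserves alignment across files, which matters if the rows of different files correspond to each other.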