I've tried many options to import a text file with this format:
headerline
headerline
headerline
[DD/MM/YY - HH:MM:SS:MS] 00975 026.98 -0000.6 -0000.5 N2
[DD/MM/YY - HH:MM:SS:MS] 00975 026.98 -0000.6 -0000.5 N2
[DD/MM/YY - HH:MM:SS:MS] 00974 026.98 -0000.6 -0000.5 N2 ...
The target is to create a 2D array that contains each value from a line as an individual element, with the lines arranged as rows. So far, I only managed to get one entire line as an element in the array using numpy.genfromtxt:
data = numpy.genfromtxt("test.txt", skip_header=3, delimiter=" ")
Any help is much appreciated!
CodePudding user response:
1- For a simple, Pythonic approach, you can do the following:
import numpy as np

# Start by reading the entire file with the readlines() method.
with open('test.txt') as my_file:
    my_array = my_file.readlines()
# Then skip the three header lines and any empty lines while iterating over my_array.
# Note: split(" ") keeps the trailing "\n" on the last field; use l.split() to drop it.
my_array = [l.split(" ") for l in my_array[3:] if l != "\n"]
# Finally, create a NumPy array from your Python list:
data = np.asarray(my_array)
2- Alternatively, you can use the genfromtxt function for a more elegant, one-line solution:
data = numpy.genfromtxt('test.txt', dtype=str, skip_header=3, delimiter=" ")
Note that:
- dtype=str tells NumPy to treat the values as strings
- NumPy automatically skips empty lines in your file
More information is available in the NumPy documentation for genfromtxt.
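As a quick way to try the genfromtxt approach without touching the real file, here is a minimal sketch that parses an in-memory sample. The sample text and its timestamps are made up to mirror the layout in the question, and io.StringIO stands in for test.txt; treat it as an illustration under those assumptions rather than a definitive implementation.

```python
import io

import numpy as np

# Hypothetical sample mirroring the layout in the question; the timestamps are made up.
sample = (
    "headerline\n"
    "headerline\n"
    "headerline\n"
    "[01/02/21 - 10:15:30:12] 00975 026.98 -0000.6 -0000.5 N2\n"
    "\n"
    "[01/02/21 - 10:15:31:12] 00974 026.98 -0000.6 -0000.5 N2\n"
)

# dtype=str keeps every field as text, skip_header=3 drops the header lines,
# and delimiter=" " splits each remaining line on single spaces.
data = np.genfromtxt(io.StringIO(sample), dtype=str, skip_header=3, delimiter=" ")

print(data.shape)
print(data[0])
```

Each row should hold eight string fields, and the blank line in the sample should not produce a row of its own.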
For the first two data lines of your file, the result should look similar to the following:
array([['[DD/MM/YY', '-', 'HH:MM:SS:MS]', ' 00975', ' 026.98', '-0000.6',
'-0000.5', 'N2\n'],
['[DD/MM/YY', '-', 'HH:MM:SS:MS]', ' 00975', ' 026.98', '-0000.6',
'-0000.5', 'N2\n']], dtype='<U12')
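If the readings are needed as numbers rather than strings, one option is to slice the numeric columns out of the string array and convert them. The sketch below assumes that columns 3 to 6 hold the numeric values, as in the sample lines of the question; the rows and their timestamps are made up for illustration.

```python
import numpy as np

# Hypothetical rows in the same shape as the string array; the timestamps are made up.
data = np.array([
    ["[01/02/21", "-", "10:15:30:12]", "00975", "026.98", "-0000.6", "-0000.5", "N2"],
    ["[01/02/21", "-", "10:15:31:12]", "00974", "026.98", "-0000.6", "-0000.5", "N2"],
])

# Assumption: columns 3 to 6 are the numeric readings. Convert them from strings to floats.
numeric = data[:, 3:7].astype(float)

print(numeric)
```

The remaining columns (the timestamp pieces and the trailing label) stay available as strings in data.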
I hope I have understood what you want to do and that my answer is useful!