I have a matrix that looks like this in a txt file:
[[0.26263508 0.89992943 0.62171512 0.20750958 0.21195397 0.97217826
0.61573457 0.05643889]
[0.33188798 0.32016444 0.92051048 0.75572024 0.20247452 0.37400282
0.10935296 0.63343081]
[0.87017165 0.7283508 0.80314653 0.80094718 0.74024014 0.16330332
0.76579785 0.75177055]
[0.2629302 0.59727507 0.60866212 0.29746334 0.54587234 0.43876005
0.75007362 0.89742691]
[0.05300406 0.83342629 0.19291691 0.83372532 0.98122163 0.7815009
0.59635085 0.9700382 ]
[0.69259902 0.42779514 0.04766533 0.62205107 0.71423376 0.85045446
0.31985818 0.15338853]
[0.26947509 0.41946874 0.87206754 0.35849082 0.94756447 0.59001803
0.41028535 0.85643487]
[0.87299386 0.70986812 0.87212445 0.30309828 0.31214338 0.33387522
0.52875374 0.75712628]
[0.51605143 0.64374971 0.37821579 0.77055732 0.12504581 0.75814223
0.87462081 0.97378988]
[1.27346865 0.73175293 1.35820425 1.08405559 0.97660218 1.31912378
0.62859619 0.94765808]]
I try to read it into a program using:
inputMatrix = np.loadtxt("testing789.txt", dtype = 'i' , delimiter=' ')
print(inputMatrix)
My problem is that the [ and ] in the file are strings that cannot be converted to int32. Is there an efficient way to read in this matrix?
CodePudding user response:
Instead of writing the matrix to a file like this:
myFile.write(str(matrix))
write it with np.savetxt, which formats it automatically:
np.savetxt("fileName.txt", matrix)
Then load the matrix from the txt file like so:
inputMatrix = np.loadtxt("testing789.txt", dtype = 'f' , delimiter=' ')
Here dtype = 'f' is used instead of 'i' so that the values are read as floats rather than integers.
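As a minimal sketch of that round trip, the snippet below writes a random matrix with np.savetxt and reads it back with np.loadtxt. The temporary directory and the random test matrix are assumptions made for illustration only. It leaves dtype at its default (float64), whereas 'f' would give float32.

```python
import os
import tempfile

import numpy as np

# Illustrative data: a random 10x8 matrix standing in for the real one
matrix = np.random.rand(10, 8)

# Hypothetical location: a file inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "testing789.txt")

np.savetxt(path, matrix)    # one row per line, values separated by spaces
loaded = np.loadtxt(path)   # default dtype is float64

print(loaded.shape)
print(np.allclose(matrix, loaded))
```

Because the file now contains only numbers and whitespace, no bracket handling is needed when reading it.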
CodePudding user response:
The correct answer here is to fix your source file, if you can, so that it is in a standard format such as the one written by np.savetxt. Then you can read it simply with np.loadtxt. If that is not possible, read on.
Since your rows seem to be divided over multiple lines, my initial suggestion (just remove [ and ], and then parse) won't work. Instead, you could add commas in the correct places so that the string is a valid Python list of lists, and use ast.literal_eval to evaluate it. Then build a NumPy array out of this list of lists.
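import ast
import re
import numpy as np

with open("fname.txt") as f:
    # Replace each run of whitespace (spaces and line breaks) with ", "
    fc = re.sub(r"\s+", ", ", f.read().strip())
lst = ast.literal_eval(fc)
arr = np.array(lst)
The regular expression r"\s+" matches one or more whitespace characters, including the line breaks inside a wrapped row; each match is replaced by a comma and a space.
Now arr is the expected array: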
array([[0.26263508, 0.89992943, 0.62171512, 0.20750958, 0.21195397,
0.97217826, 0.61573457, 0.05643889],
[0.33188798, 0.32016444, 0.92051048, 0.75572024, 0.20247452,
0.37400282, 0.10935296, 0.63343081],
[0.87017165, 0.7283508 , 0.80314653, 0.80094718, 0.74024014,
0.16330332, 0.76579785, 0.75177055],
[0.2629302 , 0.59727507, 0.60866212, 0.29746334, 0.54587234,
0.43876005, 0.75007362, 0.89742691],
[0.05300406, 0.83342629, 0.19291691, 0.83372532, 0.98122163,
0.7815009 , 0.59635085, 0.9700382 ],
[0.69259902, 0.42779514, 0.04766533, 0.62205107, 0.71423376,
0.85045446, 0.31985818, 0.15338853],
[0.26947509, 0.41946874, 0.87206754, 0.35849082, 0.94756447,
0.59001803, 0.41028535, 0.85643487],
[0.87299386, 0.70986812, 0.87212445, 0.30309828, 0.31214338,
0.33387522, 0.52875374, 0.75712628],
[0.51605143, 0.64374971, 0.37821579, 0.77055732, 0.12504581,
0.75814223, 0.87462081, 0.97378988],
[1.27346865, 0.73175293, 1.35820425, 1.08405559, 0.97660218,
1.31912378, 0.62859619, 0.94765808]])
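To try the comma-insertion idea without a file on disk, here is a small self-contained sketch. The helper name parse_bracketed_matrix and the shortened sample text are hypothetical and only for illustration, and the pattern \s+ is one possible choice of regular expression.

```python
import ast
import re

import numpy as np


def parse_bracketed_matrix(text):
    """Parse text such as str(ndarray) output into a 2-D float array."""
    # Turn every run of whitespace (including line breaks) into ", "
    # so that the text becomes a valid Python list-of-lists literal.
    literal = re.sub(r"\s+", ", ", text.strip())
    return np.array(ast.literal_eval(literal), dtype=float)


# Shortened, made-up sample with rows wrapped over two lines
sample = """[[0.26263508 0.89992943 0.62171512
  0.20750958]
 [0.33188798 0.32016444 0.92051048
  0.9700382 ]]"""

arr = parse_bracketed_matrix(sample)
print(arr.shape)  # (2, 4)
```

A space before a closing bracket only produces a trailing comma, which Python accepts. A space right after an opening bracket (for example "[ 0.1") would produce "[, 0.1" and fail, so this sketch assumes the numbers start directly after each [.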
If your rows are not divided over multiple lines, i.e. if your file looks like this:
[[0.26263508 0.89992943 0.62171512 0.20750958 0.21195397 0.97217826 0.61573457 0.05643889]
[0.33188798 0.32016444 0.92051048 0.75572024 0.20247452 0.37400282 0.10935296 0.63343081]
[0.87017165 0.7283508 0.80314653 0.80094718 0.74024014 0.16330332 0.76579785 0.75177055]
[0.2629302 0.59727507 0.60866212 0.29746334 0.54587234 0.43876005 0.75007362 0.89742691]
[0.05300406 0.83342629 0.19291691 0.83372532 0.98122163 0.7815009 0.59635085 0.9700382 ]
[0.69259902 0.42779514 0.04766533 0.62205107 0.71423376 0.85045446 0.31985818 0.15338853]
[0.26947509 0.41946874 0.87206754 0.35849082 0.94756447 0.59001803 0.41028535 0.85643487]
[0.87299386 0.70986812 0.87212445 0.30309828 0.31214338 0.33387522 0.52875374 0.75712628]
[0.51605143 0.64374971 0.37821579 0.77055732 0.12504581 0.75814223 0.87462081 0.97378988]
[1.27346865 0.73175293 1.35820425 1.08405559 0.97660218 1.31912378 0.62859619 0.94765808]]
you can just read the file in, remove [ and ], strip leading and trailing whitespace from each line, and use np.genfromtxt to read the modified string:
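import io
import numpy as np

fc = io.StringIO()
with open("fname.txt") as f:
    for line in f:
        # Drop the brackets and surrounding whitespace from each row
        fc.write(line.replace("[", "").replace("]", "").strip())
        fc.write("\n")
fc.seek(0)  # rewind so genfromtxt reads from the start
arr = np.genfromtxt(fc, delimiter=" ")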
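A possible variant of the same idea, sketched here as an assumption rather than as part of the answer above: np.loadtxt also accepts a list of strings, and with its default delimiter it splits on any whitespace, so padded values separated by two spaces are handled too. The sample lines are made up and stand in for the lines of the file.

```python
import numpy as np

# Made-up sample: one matrix row per line, as in the single-line layout
sample_lines = [
    "[[0.26263508 0.89992943 0.62171512]",
    " [0.2629302  0.59727507 0.60866212]",
    " [0.05300406 0.83342629 0.9700382 ]]",
]

# Remove the brackets; loadtxt's default delimiter is any whitespace
cleaned = [line.replace("[", "").replace("]", "") for line in sample_lines]
arr = np.loadtxt(cleaned)

print(arr.shape)  # (3, 3)
```

With a real file, the list comprehension could iterate over the open file object instead of sample_lines.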