How do I load a matrix from a .txt file in python?

Time:01-06

I have a matrix that looks like this in a txt file:

[[0.26263508 0.89992943 0.62171512 0.20750958 0.21195397 0.97217826
  0.61573457 0.05643889]
 [0.33188798 0.32016444 0.92051048 0.75572024 0.20247452 0.37400282
  0.10935296 0.63343081]
 [0.87017165 0.7283508  0.80314653 0.80094718 0.74024014 0.16330332
  0.76579785 0.75177055]
 [0.2629302  0.59727507 0.60866212 0.29746334 0.54587234 0.43876005
  0.75007362 0.89742691]
 [0.05300406 0.83342629 0.19291691 0.83372532 0.98122163 0.7815009
  0.59635085 0.9700382 ]
 [0.69259902 0.42779514 0.04766533 0.62205107 0.71423376 0.85045446
  0.31985818 0.15338853]
 [0.26947509 0.41946874 0.87206754 0.35849082 0.94756447 0.59001803
  0.41028535 0.85643487]
 [0.87299386 0.70986812 0.87212445 0.30309828 0.31214338 0.33387522
  0.52875374 0.75712628]
 [0.51605143 0.64374971 0.37821579 0.77055732 0.12504581 0.75814223
  0.87462081 0.97378988]
 [1.27346865 0.73175293 1.35820425 1.08405559 0.97660218 1.31912378
  0.62859619 0.94765808]]

I tried to read it into a program using:

inputMatrix = np.loadtxt("testing789.txt", dtype = 'i' , delimiter=' ')
print(inputMatrix)

This fails because the [ and ] characters in the file are strings that cannot be converted to int32. Is there an efficient way to read in this matrix?

CodePudding user response:

Instead of writing the matrix to a file like this: myFile.write(str(matrix)),

write it like this, so that it is saved as plain whitespace-separated rows without brackets: np.savetxt("fileName.txt", matrix)

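Then load the matrix from the txt file like so:

inputMatrix = np.loadtxt("testing789.txt", dtype = 'f' , delimiter=' ')

Here dtype = 'f' (a floating-point type) is used instead of 'i' (an integer type) because the matrix values are not integers. Note that 'f' is a 32-bit float, which keeps only about 7 significant digits; omit dtype to get NumPy's default 64-bit float.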
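A minimal round-trip sketch of this approach, assuming the array is still available in the program that writes the file. The file name matrix_demo.txt and the random sample matrix are placeholders for illustration, not part of the question; dtype and delimiter are left at their defaults, so values are read back as 64-bit floats.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample data standing in for the real matrix.
rng = np.random.default_rng(0)
matrix = rng.random((10, 8))

# Placeholder path; a temporary directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "matrix_demo.txt")

# savetxt writes one row per line, values separated by a single space.
np.savetxt(path, matrix)

# loadtxt splits on whitespace by default and returns float64 values.
loaded = np.loadtxt(path)

print(loaded.shape)
print(np.allclose(matrix, loaded))
```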

CodePudding user response:

The correct answer here is to fix your source file, if you can, to be a standard format such as what is written by np.savetxt. Then, you can read it simply by np.loadtxt. If that is not possible, read on.


Since your rows seem to be divided over multiple lines, my initial suggestion (just remove [ and ], and then parse) won't work. Instead, you could add commas in the correct places so that the string is a valid Python list of lists, and use ast.literal_eval to evaluate it. Then, build a NumPy array out of this list of lists.

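import ast
import re

import numpy as np

with open("fname.txt") as f:
    # Strip first, so a trailing newline does not turn into a dangling comma
    # (which would make literal_eval return a one-element tuple).
    # Then replace every run of whitespace, including line breaks, with ", ".
    fc = re.sub(r"\s+", ", ", f.read().strip())

lst = ast.literal_eval(fc)
arr = np.array(lst)

The regular expression r"\s+" matches one or more whitespace characters (spaces and line breaks); each such run is replaced by a comma followed by a space.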

Now arr is the expected array:

array([[0.26263508, 0.89992943, 0.62171512, 0.20750958, 0.21195397,
        0.97217826, 0.61573457, 0.05643889],
       [0.33188798, 0.32016444, 0.92051048, 0.75572024, 0.20247452,
        0.37400282, 0.10935296, 0.63343081],
       [0.87017165, 0.7283508 , 0.80314653, 0.80094718, 0.74024014,
        0.16330332, 0.76579785, 0.75177055],
       [0.2629302 , 0.59727507, 0.60866212, 0.29746334, 0.54587234,
        0.43876005, 0.75007362, 0.89742691],
       [0.05300406, 0.83342629, 0.19291691, 0.83372532, 0.98122163,
        0.7815009 , 0.59635085, 0.9700382 ],
       [0.69259902, 0.42779514, 0.04766533, 0.62205107, 0.71423376,
        0.85045446, 0.31985818, 0.15338853],
       [0.26947509, 0.41946874, 0.87206754, 0.35849082, 0.94756447,
        0.59001803, 0.41028535, 0.85643487],
       [0.87299386, 0.70986812, 0.87212445, 0.30309828, 0.31214338,
        0.33387522, 0.52875374, 0.75712628],
       [0.51605143, 0.64374971, 0.37821579, 0.77055732, 0.12504581,
        0.75814223, 0.87462081, 0.97378988],
       [1.27346865, 0.73175293, 1.35820425, 1.08405559, 0.97660218,
        1.31912378, 0.62859619, 0.94765808]])
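To see what the comma-insertion step does, here is a small sketch on a made-up two-row input whose rows wrap across lines. It assumes the pattern r"\s+" (any run of whitespace) and a strip() before substituting; the names sample, as_literal and small are illustrative only.

```python
import ast
import re

import numpy as np

# Hypothetical two-row sample with rows wrapped over several lines.
sample = "[[0.1 0.2\n  0.3]\n [0.4 0.5\n  0.6 ]]\n"

# Every run of spaces and newlines becomes ", ".
as_literal = re.sub(r"\s+", ", ", sample.strip())
print(as_literal)  # [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6, ]]

# A trailing comma before "]" is still a valid Python list literal.
small = np.array(ast.literal_eval(as_literal))
print(small.shape)  # (2, 3)
```

One limitation of this sketch: a space directly after an opening bracket (for example "[ 0.1") would produce "[, 0.1", which ast.literal_eval rejects, so such input would need extra cleanup first.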

If your rows are not divided over multiple lines, i.e. if your file looks like this:

[[0.26263508 0.89992943 0.62171512 0.20750958 0.21195397 0.97217826 0.61573457 0.05643889]
 [0.33188798 0.32016444 0.92051048 0.75572024 0.20247452 0.37400282 0.10935296 0.63343081]
 [0.87017165 0.7283508 0.80314653 0.80094718 0.74024014 0.16330332 0.76579785 0.75177055]
 [0.2629302 0.59727507 0.60866212 0.29746334 0.54587234 0.43876005 0.75007362 0.89742691]
 [0.05300406 0.83342629 0.19291691 0.83372532 0.98122163 0.7815009 0.59635085 0.9700382 ]
 [0.69259902 0.42779514 0.04766533 0.62205107 0.71423376 0.85045446 0.31985818 0.15338853]
 [0.26947509 0.41946874 0.87206754 0.35849082 0.94756447 0.59001803 0.41028535 0.85643487]
 [0.87299386 0.70986812 0.87212445 0.30309828 0.31214338 0.33387522 0.52875374 0.75712628]
 [0.51605143 0.64374971 0.37821579 0.77055732 0.12504581 0.75814223 0.87462081 0.97378988]
 [1.27346865 0.73175293 1.35820425 1.08405559 0.97660218 1.31912378 0.62859619 0.94765808]]

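you can just read the file in, remove [ and ], strip the surrounding whitespace from each line, and use np.genfromtxt to read the modified text:

import io

import numpy as np

fc = io.StringIO()
with open("fname.txt") as f:
    for line in f:
        # Drop the brackets, then trim leading and trailing whitespace.
        fc.write(line.replace("[", "").replace("]", "").strip())
        fc.write("\n")

fc.seek(0)
# The default delimiter splits on any run of whitespace, so padded
# columns (two spaces after a short number) are also handled.
arr = np.genfromtxt(fc)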
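As an alternative that does not appear in the answers above, both layouts (wrapped and unwrapped rows) could be handled by dropping the brackets, splitting on whitespace, and recovering the row count from the number of closing brackets. This is only a sketch: load_bracketed_matrix is a hypothetical helper name, and it assumes a 2-D array of plain numbers with no "..." elision in the file.

```python
import os
import tempfile

import numpy as np


def load_bracketed_matrix(path):
    """Parse a file holding the str() form of a 2-D float array."""
    with open(path) as f:
        text = f.read()
    # Each row contributes one "]", plus one more for the outer bracket.
    n_rows = text.count("]") - 1
    # With the brackets gone, only whitespace-separated numbers remain.
    values = text.replace("[", " ").replace("]", " ").split()
    return np.array(values, dtype=float).reshape(n_rows, -1)


# Usage on a made-up file whose rows wrap across lines.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, "w") as f:
    f.write("[[0.1 0.2\n  0.3]\n [0.4 0.5\n  0.6 ]]\n")

result = load_bracketed_matrix(demo_path)
print(result.shape)
```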