Problems reading files with letters and numbers

Time:09-08

I'm trying to work with a file that contains a header, a set of numbers separated by double spaces, and some text at the end (as shown in the image below).

I'm using this as sample.txt; check the Pastebin here.

So the code goes:

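# the with block closes the file automatically once the lines are read
with open("sample.txt", "r") as f:
    file_lines = f.read().splitlines()
# the header is the 2nd line of the file (index 1)
header_line = file_lines[1]
# split takes a separator as its first argument and already returns a list
headers = header_line.split("  ")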
numbers_line = file_lines[2]
# strip removes spaces from the start and end, e.g. "                1  2 3"
numbers_line = numbers_line.strip().split("  ")
# in my example the data starts at the 4th line and ends at the 8th line (inclusive)
data_line_start = 4
data_line_end = 8
data_lines = file_lines[data_line_start-1:data_line_end]
# remove spaces from the start and end of each data line
data_lines = [j.strip() for j in data_lines]
# data_lines now contains:
# ['0.03592  0.04902  0.0248  0.0327  0.0520  0.0318', '0.0553  0.06602  0.0548  0.0232  0.0710  0.0782', '0.08413  0.04402  0.0348  0.0654  0.0612  0.0428', '0.0543  0.06202  0.0148  0.0732  0.0810  0.0882', '0.0443  0.04102  0.0343  0.0556  0.0652  0.0928']
# we still need to split each line using a double space as the separator
data_array = []
for data_line in data_lines:
    data_line_formatted = [float(k) for k in data_line.split("  ")]
    data_array.append(data_line_formatted)
print("HEADERS")
print(headers)
print("NUMBERS LINE")
print(numbers_line)
print("DATA ARRAY")
print(data_array)

OUTPUT:

HEADERS
['Plate:', 'PLate1', '1.3', 'PlateFormat', 'EndPoint', 'Absorbance', 'Reduced', 'FALSE', '1', '1', '410', '1', '12', '96', '1', '5']
NUMBERS LINE
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']
DATA ARRAY
[[0.03592, 0.04902, 0.0248, 0.0327, 0.052, 0.0318], [0.0553, 0.06602, 0.0548, 0.0232, 0.071, 0.0782], [0.08413, 0.04402, 0.0348, 0.0654, 0.0612, 0.0428], [0.0543, 0.06202, 0.0148, 0.0732, 0.081, 0.0882], [0.0443, 0.04102, 0.0343, 0.0556, 0.0652, 0.0928]]

You can use the open() function to open the file, read it into a list of lines stored in the file_lines variable, and then use Python string methods to format the data. The script above might not fit your file exactly, but you can adapt it to your needs. Let me know if it helped you.
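
One thing to watch when adapting it: split("  ") breaks only on exactly two spaces, so a row whose values are separated by a single space or a tab comes back as one unsplit field and float() raises ValueError. A minimal sketch of a more tolerant variant is below, assuming the values are separated by whitespace of any width. The helper name parse_data_lines and the sample_lines list are hypothetical stand-ins for illustration, not the contents of the real sample.txt.

```python
def parse_data_lines(lines, start, end):
    """Convert lines start..end (1-based, inclusive) into lists of floats."""
    rows = []
    for line in lines[start - 1:end]:
        # split() with no argument splits on any run of whitespace
        # (one space, several spaces, tabs) and ignores leading/trailing blanks
        rows.append([float(token) for token in line.split()])
    return rows


# Hypothetical stand-in for file_lines; the real file will differ
sample_lines = [
    "first line (ignored)",
    "Plate:  PLate1  1.3",
    "        1  2   3",
    "  0.03592  0.04902   0.0248",
    "0.0553 0.06602\t0.0548  ",
    "some text at the end",
]

rows = parse_data_lines(sample_lines, 4, 5)
print(rows)
# [[0.03592, 0.04902, 0.0248], [0.0553, 0.06602, 0.0548]]
```

With a real file you would pass file_lines and your own start and end line numbers instead of sample_lines. The start and end values still have to be chosen by hand, as in the script above.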
