How to split and read raw data into different NumPy arrays based on a delimiter

Time:12-07

I have raw data in the following form:

#######
#######
#col1 #col2 #col3
1       10    100
2       11    150
3       14    155
#######
#######
#######
#######
#col1 #col2 #col3
1       14    100
2       17    180
3       14    155
#######
#######
#######
#######
#col1 #col2 #col3
1       19    156
2       27    130
3       24    152
#######
#######

I want to load this data into NumPy arrays. When I load it with numpy.loadtxt, all of the data ends up in a single array. Is there an easy way to split the data into separate chunks based on the ####### lines?

CodePudding user response:

A simple way to do it is to read the whole file, split the resulting string at the separators, strip the leftover separator lines, and call numpy.loadtxt on each resulting list of strings. (As the documentation explains, numpy.loadtxt treats a list of strings as a sequence of lines.)

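import numpy as np
from typing import List

filename: str = "data_file.txt"  # Put your filename here instead

with open(filename, "r", encoding="utf-8") as file:
    content: str = file.read()

# Consecutive blocks are separated by four "#######" lines in a row.
chunks: List[str] = content.split(4 * "#######\n")
arrays: List[np.ndarray] = []
for chunk in chunks:
    # Drop the leftover separator lines at the start and end of the file.
    # The "#col1 #col2 #col3" header is skipped by np.loadtxt as a comment.
    lines: List[str] = chunk.replace("#######\n", "").split("\n")
    arrays.append(np.loadtxt(lines))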
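
If the number of separator lines between blocks is not always exactly four, a line-based variant may be more robust. The following is only a sketch and not part of the answer above: split_blocks and SEPARATOR are hypothetical names, and it assumes that every separator line consists of exactly "#######" and that blank lines can be treated like separators.

```python
from itertools import groupby
from typing import Iterable, List

import numpy as np

SEPARATOR = "#######"  # assumption: separator lines contain exactly this text


def split_blocks(lines: Iterable[str], separator: str = SEPARATOR) -> List[np.ndarray]:
    """Parse each run of lines between separators into its own array."""
    stripped = (line.strip() for line in lines)
    arrays: List[np.ndarray] = []
    # groupby yields alternating runs of separator/blank lines and other lines.
    for is_separator, run in groupby(stripped, key=lambda s: s == separator or not s):
        if is_separator:
            continue
        block = list(run)
        # Ignore runs that hold only "#..." header lines and no numbers.
        if all(line.startswith("#") for line in block):
            continue
        # np.loadtxt skips the "#col1 ..." header: "#" is its default comment marker.
        arrays.append(np.loadtxt(block))
    return arrays


# Shortened version of the data from the question, used only for illustration.
sample = """#######
#######
#col1 #col2 #col3
1       10    100
2       11    150
#######
#######
#######
#col1 #col2 #col3
1       14    100
2       17    180
#######
"""
arrays = split_blocks(sample.splitlines())
print(len(arrays), arrays[0].shape)
```

Because an open file object is itself an iterable of lines, the same function should also work as split_blocks(file) inside the with block, without reading the whole file into one string first.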