date Mon Jan 4 15:59:21.129 2021
base hex timestamps absolute
no internal events logged
// version 13.0.0
//545285.973861 previous log file: Myfile_0.asc
// Measurement UUID: 4520e127-a0b6-48d2-9e23-2588160af285
545333.620639 LoggingString := "Log,11:28 PM, Sunday, January 10, 2021,11:28:17.4,34.72,12,0.01058,11.99,0.01077,12,0.01127,11.99,0.01142,11.76,0.1053,11.99,0.01076,11.96,0.01092,2.516,0,2,OM_2_1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0"
545335.691676 LoggingString := "Log,11:28 PM, Sunday, January 10, 2021,11:28:19.5,34.61,12,0.01058,11.99,0.01072,11.99,0.01127,11.99,0.01139,11.87,0.1118,12.01,0.01046,11.99,0.01145,2.581,0,2,OM_2_1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0"
545337.796715 LoggingString := "Log,11:28 PM, Sunday, January 10, 2021,11:28:21.6,34.52,11.99,0.0106,11.99,0.01077,11.99,0.01151,11.99,0.01139,11.72,0.1081,12,0.0109,11.96,0.01107,2.543,0,2,OM_2_1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0"
545339.919752 LoggingString := "Log,11:28 PM, Sunday, January 10, 2021,11:28:23.7,34.41,12,0.01082,11.99,0.01104,11.99,0.01156,11.99,0.01164,11.62,0.1042,11.99,0.01105,11.96,0.01126,2.596,0,2,OM_2_1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0"
The text above is my input data from a log file (also available below as an image):
I want to perform certain operations on the data shown in the image, but I cannot figure out how to split each line into its elements. The data that follows 11:28:17.4 is what matters to me. I have used the numpy.genfromtxt function with the usecols argument to print the data between columns 6 and 22. However, I want to split each row into individual elements so that I can use them as identifiers that mark where to begin recording the important data.
For example, line 7 uses both whitespace and commas as separators. How do I split the data so that I end up with the following output:
List = ['545333.620639', 'LoggingString :=', 'Log', ..., '2021', '11:28:17.4', '34.72', ...]
Also, when I use readlines(), is the data stored as one complete string in the list, or as individual string elements in the list?
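A minimal sketch of what readlines() returns. It uses io.StringIO as a stand-in for the opened log file (an assumption made only so the snippet runs without the real .asc file), with contents abridged from the log above:

```python
import io

# Stand-in for an opened .asc file; the contents are abridged from the log above.
fake_file = io.StringIO(
    "base hex timestamps absolute\n"
    "// version 13.0.0\n"
    '545333.620639 LoggingString := "Log,11:28 PM, Sunday"\n'
)

lines = fake_file.readlines()

print(type(lines))      # <class 'list'>
print(len(lines))       # 3, one string per line of the file
print(repr(lines[0]))   # 'base hex timestamps absolute\n'
```

So the result is a list with one string per line, and each string keeps its trailing newline. The fields inside a line are not split; that still has to be done separately, for example with str.split().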
The code below is a more hard-coded approach to what I want. It gives me a .csv file with specific data extracted from a larger dataset. However, I want a better approach:
Instead of manually defining a line number as the counter for when to start storing data into the .csv, I want to start storing data from the line after "// Measurement UUID:" is detected.
I want to separate each line into individual elements.
I want to know how to define multiple delimiters for the np.genfromtxt function.
import numpy as np
import pandas as pd

path = 'C:/Documents/Myfile.asc'
with open(path, 'r') as Testfile:
    Read_data = Testfile.readlines()

count = 0
for line in Read_data:
    count += 1
    if count < 7:  # counter to start saving data into .csv from the 7th line
        print("Line{}: {}".format(count, line.strip()))
    else:
        # Parse everything from the 7th line onward in one call, then stop looping.
        mydat = np.genfromtxt(path, skip_header=(count - 1),
                              usecols=tuple(range(4, 24)), delimiter=',')
        Data_frame = pd.DataFrame(mydat)
        Data_frame.to_csv("Triall_3.csv", sep=';')
        break
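As far as I know, the delimiter argument of np.genfromtxt takes a single separator string (or fixed field widths), so mixed separators are usually handled before the data reaches NumPy. A minimal sketch using re.split follows; the names SEPARATORS and split_log_line are made up for illustration, and the sample line is abridged from the log above:

```python
import re

# Split on a comma or a double quote, swallowing any spaces around them.
SEPARATORS = re.compile(r'\s*,\s*|\s*"\s*')

def split_log_line(line):
    """Return the timestamp, the variable name, and every comma-separated field."""
    parts = [p for p in SEPARATORS.split(line.strip()) if p]
    # The text before the first quote still holds two items separated by whitespace.
    timestamp, name = parts[0].split(maxsplit=1)
    return [timestamp, name] + parts[1:]

# Abridged copy of the first LoggingString line from the log above.
sample = ('545333.620639 LoggingString := '
          '"Log,11:28 PM, Sunday, January 10, 2021,11:28:17.4,34.72,12,0.01058"')

fields = split_log_line(sample)
print(fields[:3])   # ['545333.620639', 'LoggingString :=', 'Log']
print(fields[7:])   # ['11:28:17.4', '34.72', '12', '0.01058']
```

Every element is still a string here; converting the numeric fields with float() would be a separate step.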
User response:
I hope I've understood your question correctly. You can check whether LoggingString := appears in the line and, if it does, split the string:
import pandas as pd

out = []
with open("your_file.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if "LoggingString :=" in line:
            # Locate the quoted, comma-separated payload.
            first_quote = line.index('"')
            last_quote = line.index('"', first_quote + 1)
            out.append(
                # The timestamp and "LoggingString :=" sit before the first quote.
                line[:first_quote].split(maxsplit=1)
                + line[first_quote + 1 : last_quote].split(","),
            )

df = pd.DataFrame(out)
print(df)
Prints:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43
0 545333.620639 LoggingString := Log 11:28 PM Sunday January 10 2021 11:28:17.4 34.72 12 0.01058 11.99 0.01077 12 0.01127 11.99 0.01142 11.76 0.1053 11.99 0.01076 11.96 0.01092 2.516 0 2 OM_2_1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 545335.691676 LoggingString := Log 11:28 PM Sunday January 10 2021 11:28:19.5 34.61 12 0.01058 11.99 0.01072 11.99 0.01127 11.99 0.01139 11.87 0.1118 12.01 0.01046 11.99 0.01145 2.581 0 2 OM_2_1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 545337.796715 LoggingString := Log 11:28 PM Sunday January 10 2021 11:28:21.6 34.52 11.99 0.0106 11.99 0.01077 11.99 0.01151 11.99 0.01139 11.72 0.1081 12 0.0109 11.96 0.01107 2.543 0 2 OM_2_1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 545339.919752 LoggingString := Log 11:28 PM Sunday January 10 2021 11:28:23.7 34.41 12 0.01082 11.99 0.01104 11.99 0.01156 11.99 0.01164 11.62 0.1042 11.99 0.01105 11.96 0.01126 2.596 0 2 OM_2_1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
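The question also asks to start recording only after "// Measurement UUID:" has been seen and to write the result to a semicolon-separated file. One way that could look is sketched below, using only the standard library. The helper name rows_after_marker is hypothetical, the sample lines are abridged from the log in the question, and an in-memory buffer stands in for the real .csv file:

```python
import csv
import io

MARKER = "// Measurement UUID:"

def rows_after_marker(lines, marker=MARKER):
    """Yield the quoted fields of each LoggingString line that follows the marker."""
    started = False
    for line in map(str.strip, lines):
        if not started:
            # Ignore everything up to and including the marker line.
            started = line.startswith(marker)
            continue
        if "LoggingString :=" in line:
            first_quote = line.index('"')
            last_quote = line.rindex('"')
            payload = line[first_quote + 1:last_quote]
            yield [field.strip() for field in payload.split(",")]

# Abridged copy of the log from the question.
sample = [
    "// version 13.0.0\n",
    "// Measurement UUID: 4520e127-a0b6-48d2-9e23-2588160af285\n",
    '545333.620639 LoggingString := "Log,11:28 PM, Sunday, January 10, 2021,11:28:17.4,34.72,12,0.01058"\n',
    '545335.691676 LoggingString := "Log,11:28 PM, Sunday, January 10, 2021,11:28:19.5,34.61,12,0.01058"\n',
]

rows = list(rows_after_marker(sample))

# Keep the time-of-day field and everything after it, then write with ';'.
buffer = io.StringIO()
csv.writer(buffer, delimiter=";").writerows(row[5:] for row in rows)
print(buffer.getvalue())
```

With a real file, the list could be replaced by the open file object, and the buffer by a file opened with open("out.csv", "w", newline=""); the output file name here is only a placeholder.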