There are several types of commands in the third column of the text file, so I am using regular expressions to count the number of occurrences of each type of command. For example, ACTIVE occurs 3 times and REFRESH occurs 2 times.
I would like to make my script more flexible by recording the time of each command. Since one command can occur more than once, associating each occurrence with its time would let users know which ACTIVE occurs at what time. Any guidance or suggestions are welcome.
My code:
import re

a = a_1 = b = b_1 = c = d = e = 0
lines = open("page_stats.txt", "r").readlines()
for line in lines:
    # check the longer names first, since "WRITING" also matches "WRITING_A"
    if re.search(r"WRITING_A", line):
        a_1 += 1
    elif re.search(r"WRITING", line):
        a += 1
    elif re.search(r"READING_A", line):
        b_1 += 1
    elif re.search(r"READING", line):
        b += 1
    elif re.search(r"PRECHARGE", line):
        c += 1
    elif re.search(r"ACTIVE", line):
        d += 1
File content:
------------------------------------
| Number | Time | Command   | Data |
------------------------------------
| 1      | 0015 | ACTIVE    |      |
| 2      | 0030 | WRITING   |      |
| 3      | 0100 | WRITING_A |      |
| 4      | 0115 | PRECHARGE |      |
| 5      | 0120 | REFRESH   |      |
| 6      | 0150 | ACTIVE    |      |
| 7      | 0200 | READING   |      |
| 8      | 0314 | PRECHARGE |      |
| 9      | 0318 | ACTIVE    |      |
| 10     | 0345 | WRITING_A |      |
| 11     | 0430 | WRITING_A |      |
| 12     | 0447 | WRITING   |      |
| 13     | 0503 | PRECHARGE |      |
| 14     | 0610 | REFRESH   |      |
Answer:
Assuming you want to count the occurrences of each command and store the timestamps of each command as well, would you please try:
import re

count = {}
timestamps = {}
with open("page_stats.txt", "r") as f:
    for line in f:
        m = re.split(r"\s*\|\s*", line)
        # only data rows have a numeric first column; header and separator lines are skipped
        if len(m) > 3 and re.match(r"\d+", m[1]):
            count[m[3]] = count[m[3]] + 1 if m[3] in count else 1
            if m[3] in timestamps:
                timestamps[m[3]].append(m[2])
            else:
                timestamps[m[3]] = [m[2]]

# see the limited result (example)
#print(count["ACTIVE"])
#print(timestamps["ACTIVE"])

# see the results
for key in count:
    print("%-10s: %d, %s" % (key, count[key], timestamps[key]))
Output:
REFRESH   : 2, ['0120', '0610']
WRITING   : 2, ['0030', '0447']
PRECHARGE : 3, ['0115', '0314', '0503']
ACTIVE    : 3, ['0015', '0150', '0318']
READING   : 1, ['0200']
WRITING_A : 3, ['0100', '0345', '0430']
Explanation:
- m = re.split(r"\s*\|\s*", line) splits line on a pipe character which may be preceded and/or followed by blank characters.
- Because each row starts with a pipe, m[0] is an empty string. The list elements m[1], m[2], m[3] therefore hold the Number, Time, and Command, in that order.
- The condition if len(m) > 3 and re.match(r"\d+", m[1]) skips the header and separator lines, because only data rows have a number in the first column.
- Then the dictionary variables count and timestamps are assigned, incremented, or appended to, one row at a time.
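As a variation on the steps above, the two dictionaries can be folded into one, since the count of a command is just the length of its timestamp list. This is only a minimal sketch, assuming the same pipe-delimited layout as the file shown in the question; the names collect_timestamps and sample are hypothetical and introduced here for illustration.

```python
import re
from collections import defaultdict


def collect_timestamps(lines):
    """Map each command name to the list of times at which it occurs."""
    timestamps = defaultdict(list)
    for line in lines:
        fields = re.split(r"\s*\|\s*", line.strip())
        # fields[0] is the empty string before the leading pipe, so the
        # Number, Time and Command columns are fields[1], fields[2], fields[3].
        if len(fields) > 3 and re.fullmatch(r"\d+", fields[1]):
            timestamps[fields[3]].append(fields[2])
    return timestamps


# A few rows in the same layout as page_stats.txt (illustrative sample only).
sample = [
    "------------------------------------",
    "| Number | Time | Command | Data |",
    "| 1 | 0015 | ACTIVE | |",
    "| 2 | 0030 | WRITING | |",
    "| 6 | 0150 | ACTIVE | |",
]

result = collect_timestamps(sample)
for command, times in result.items():
    # the number of occurrences is the length of the timestamp list
    print("%-10s: %d, %s" % (command, len(times), times))
```

To run it on the real file, pass the open file object instead of the sample list, for example collect_timestamps(open("page_stats.txt")). Using defaultdict(list) removes the need for the explicit "if key in dict" checks, at the cost of one extra import.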