I have a Python question. Below is a formatted table (the asterisks are only there to highlight the values I want; they are not actually in the table):
Step Time Apple_price fluctuation
BFGS: 0 18:21:43 -6442.333161 7.4744
BFGS: 1 18:21:43 *-6442.899477 5.8484*
Step Time Apple_price fluctuation
BFGS: 0 18:21:53 -6441.911200 16.3190
BFGS: 1 18:21:53 -6442.540975 10.6048
BFGS: 2 18:21:53 -6443.107163 7.6685
BFGS: 3 18:21:53 -6443.565044 6.2186
BFGS: 4 18:21:54 *-6443.954663 5.7485*
Step Time Apple_price fluctuation
BFGS: 0 18:27:00 -6440.611426 24.6802
BFGS: 1 18:27:00 -6441.602767 21.3009
BFGS: 2 18:27:00 -6442.446886 15.6698
BFGS: 3 18:27:01 -6443.084822 11.6312
BFGS: 4 18:27:01 -6443.582671 8.6795
BFGS: 5 18:27:01 -6444.019236 7.4906
BFGS: 6 18:27:01 -6444.389951 6.7435
BFGS: 7 18:27:02 *-6444.732455 6.5221*
I would like to extract the values between "*" as follows:
-6442.899477 5.8484
-6443.954663 5.7485
-6444.732455 6.5221
My code is as follows:
import pandas as pd
import numpy as np
all_lines = []
file_name = input("What's the file name with extension?: ")
with open(f'{file_name}', 'r') as file:
    for each_line in file:
        all_lines.append(each_line.strip())
#print(all_lines)
for j in all_lines:
    if j == 0:
        j = j + 1
    if 'fluctuation' in i:
        all_lines.index(j-1)
        print(j)
Unfortunately, the output contains only the first line of the expected answer:
-6442.899477 5.8484
How can I extract the values at specific indices of a list like this?
CodePudding user response:
Import the regular expression module:
import re
Preparing data:
text = """ Step Time Apple_price fluctuation
BFGS: 0 18:21:43 -6442.333161 7.4744
BFGS: 1 18:21:43 *-6442.899477 5.8484*
Step Time Apple_price fluctuation
BFGS: 0 18:21:53 -6441.911200 16.3190
BFGS: 1 18:21:53 -6442.540975 10.6048
BFGS: 2 18:21:53 -6443.107163 7.6685
BFGS: 3 18:21:53 -6443.565044 6.2186
BFGS: 4 18:21:54 *-6443.954663 5.7485*
Step Time Apple_price fluctuation
BFGS: 0 18:27:00 -6440.611426 24.6802
BFGS: 1 18:27:00 -6441.602767 21.3009
BFGS: 2 18:27:00 -6442.446886 15.6698
BFGS: 3 18:27:01 -6443.084822 11.6312
BFGS: 4 18:27:01 -6443.582671 8.6795
BFGS: 5 18:27:01 -6444.019236 7.4906
BFGS: 6 18:27:01 -6444.389951 6.7435
BFGS: 7 18:27:02 *-6444.732455 6.5221*"""
Define the regular expression, which lists the characters allowed between the two asterisks (minus sign, space, digits, and dot):
p = re.compile(r'\*[- 0-9.]*\*')
Match the regular expression against the text:
a = p.findall(text)
a is the list of matches. enumerate yields each index together with its match:
for k, v in enumerate(a):
    print(k, v)
Output:
0 *-6442.899477 5.8484*
1 *-6443.954663 5.7485*
2 *-6444.732455 6.5221*
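If only the numbers are wanted, without the surrounding asterisks, a capture group is one option. The following is a minimal sketch: the shortened sample string and the names pattern and pairs are illustrative assumptions and not part of the answer above.

```python
import re

# Shortened sample in the same layout as the table above (illustrative subset).
sample = """Step Time Apple_price fluctuation
BFGS: 0 18:21:43 -6442.333161 7.4744
BFGS: 1 18:21:43 *-6442.899477 5.8484*
Step Time Apple_price fluctuation
BFGS: 0 18:21:53 -6441.911200 16.3190
BFGS: 4 18:21:54 *-6443.954663 5.7485*"""

# The parentheses capture only the text between the two asterisks.
pattern = re.compile(r'\*([-0-9. ]+)\*')

# With a single capture group, findall returns the captured text
# rather than the whole match, so the asterisks are dropped.
pairs = [tuple(float(x) for x in m.split()) for m in pattern.findall(sample)]

for price, fluctuation in pairs:
    print(price, fluctuation)
```

Converting to float is optional; keeping the captured strings as they are preserves the original number of decimal places.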
CodePudding user response:
Sorry, I did not explain this well. The asterisks are not in the table; I added them only to show which data I would like to print. Could you please suggest a solution that works without the asterisks? Best regards
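Without the asterisks, the wanted values appear to be the last row of each block, that is, the line just before each new header plus the final line of the file. A minimal sketch under that assumption follows. The function name last_rows and the inline sample are hypothetical, and the sketch assumes that every header line contains the word "fluctuation" and that the last two columns of each data row are the price and the fluctuation.

```python
def last_rows(lines, header_word="fluctuation"):
    """Return (price, fluctuation) from the last data row of each block."""
    results = []
    previous = None  # most recent data row seen in the current block
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if header_word in line:
            # A new header closes the previous block, if there was one.
            if previous is not None:
                results.append(previous)
            previous = None
        else:
            previous = line
    if previous is not None:
        results.append(previous)  # the final block has no header after it
    # Assumption: the last two whitespace-separated fields are the wanted values.
    return [tuple(float(x) for x in row.split()[-2:]) for row in results]


# Illustrative subset of the table, without asterisks.
sample = """Step Time Apple_price fluctuation
BFGS: 0 18:21:43 -6442.333161 7.4744
BFGS: 1 18:21:43 -6442.899477 5.8484
Step Time Apple_price fluctuation
BFGS: 0 18:21:53 -6441.911200 16.3190
BFGS: 4 18:21:54 -6443.954663 5.7485"""

for price, fluctuation in last_rows(sample.splitlines()):
    print(price, fluctuation)
```

For a real file, the open file object can be passed in directly, for example with open(file_name) as f: rows = last_rows(f), since the function only needs an iterable of lines.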