Pls help im tired. cant see why how to make it work.
Problem to solve: a .csv file has to have suposedly 1sec data like:
time,open,high,low,close,Extremum,Fib 1,Fib 2,Fib 3,l100
2022-04-03 02:00:00,3.294,3.294,3.294,3.294,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:04,3.294,3.295,3.292,3.292,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:05,3.293,3.293,3.292,3.292,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:07,3.293,3.293,3.293,3.293,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:08,3.293,3.293,3.293,3.293,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:09,3.292,3.292,3.292,3.292,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
but it hasn't. Some seconds are not present, so im making it with last seen data basicly reads a line:
sep = ','
data = line.split(sep)
and data[1] to data[9] stays the same, only data[0] changes 1 second, to fill the gap:
time,open,high,low,close,Extremum,Fib 1,Fib 2,Fib 3,l100
2022-04-03 02:00:00,3.294,3.294,3.294,3.294,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:01,3.294,3.294,3.294,3.294,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:02,3.294,3.294,3.294,3.294,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:03,3.294,3.294,3.294,3.294,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:04,3.294,3.295,3.292,3.292,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:05,3.293,3.293,3.292,3.292,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:06,3.293,3.293,3.292,3.292,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:07,3.293,3.293,3.293,3.293,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:08,3.293,3.293,3.293,3.293,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:09,3.292,3.292,3.292,3.292,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
i made the logic for this, just the .append makes me tricks.. the output is for every line in source csv file, it produces me same amount of rows in output csv file, but all records are with the same, last line of a source file, what the f, :
time,open,high,low,close,Extremum,Fib 1,Fib 2,Fib 3,l100
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
2022-04-03 02:00:18,3.289,3.289,3.289,3.289,3.277,3.332898006846162,3.348093581522788,3.357138566449352,3.367449849265634
whyyyyyy, heres the code:
import glob
import datetime
import time
import pandas as pd
# make files sync on sec's
filenames = [i for i in glob.glob("*unique_sorted.csv")]
for filename in filenames:
coin_name = filename[0:18]
print(filename)
with open (filename, "r") as f:
x = -1
memory = {}
memory["data"] = {}
memory["data"]["time"] = {}
new_file = []
tolist = {}
tolist["total"] = {}
tolist["memory"] = {}
sep = ','
start = 1
for line in f:
data = line.split(sep)
if data[0] != "time":
x = x 1
if start == 0 and data[0][0:19] != sec_1_more:
memory_time = datetime.datetime.strptime(tolist["memory"]['time'], "%Y-%m-%d %H:%M:%S") # data[0] from previous line
read_line_time = datetime.datetime.strptime(data[0][0:19], "%Y-%m-%d %H:%M:%S") # current_line
diff = read_line_time - sec_1_more
diff_sec = diff.total_seconds()
sec = int(diff_sec)
i = 1
while i < sec:
time_for_same_data = memory_time datetime.timedelta(seconds=1) # 2:00:00 1 second
time_for_same_data_str = str(time_for_same_data)
#2022-04-03 02:00:04,1.4073,1.4073,1.4071,1.4072,1.375,1.4137077251573131,1.4242302135495926,1.4304935994973778,1.437633859477853
tolist["total"]['time'] = time_for_same_data_str
tolist["total"]['open'] = data[1]
tolist["total"]['high'] = data[2]
tolist["total"]['low'] = data[3]
tolist["total"]['close'] = data[4]
tolist["total"]['Extremum'] = data[5]
tolist["total"]['Fib 1'] = data[6]
tolist["total"]['Fib 2'] = data[7]
tolist["total"]['Fib 3'] = data[8]
tolist["total"]['l100'] = data[9].strip()
new_file.append(tolist["total"])
memory_time = memory_time datetime.timedelta(seconds=1)
i = i 1
tolist["total"]['time'] = data[0]
tolist["total"]['open'] = data[1]
tolist["total"]['high'] = data[2]
tolist["total"]['low'] = data[3]
tolist["total"]['close'] = data[4]
tolist["total"]['Extremum'] = data[5]
tolist["total"]['Fib 1'] = data[6]
tolist["total"]['Fib 2'] = data[7]
tolist["total"]['Fib 3'] = data[8]
tolist["total"]['l100'] = data[9].strip()
new_file.append(tolist["total"])
elif start == 0 and data[0][0:19] == sec_1_more:
tolist["total"]['time'] = data[0]
tolist["total"]['open'] = data[1]
tolist["total"]['high'] = data[2]
tolist["total"]['low'] = data[3]
tolist["total"]['close'] = data[4]
tolist["total"]['Extremum'] = data[5]
tolist["total"]['Fib 1'] = data[6]
tolist["total"]['Fib 2'] = data[7]
tolist["total"]['Fib 3'] = data[8]
tolist["total"]['l100'] = data[9].strip()
new_file.append(tolist["total"])
memory["data"]["data"] = str(line)
memory["data"]["time"] = str(data[0][0:19])
tolist["memory"]['time'] = data[0]
tolist["memory"]['open'] = data[1]
tolist["memory"]['high'] = data[2]
tolist["memory"]['low'] = data[3]
tolist["memory"]['close'] = data[4]
tolist["memory"]['Extremum'] = data[5]
tolist["memory"]['Fib 1'] = data[6]
tolist["memory"]['Fib 2'] = data[7]
tolist["memory"]['Fib 3'] = data[8]
tolist["memory"]['l100'] = data[9].strip()
#t = "2022-04-03 02:00:04"
t = datetime.datetime.strptime(memory["data"]["time"], "%Y-%m-%d %H:%M:%S")
#sec_1_more = (t datetime.timedelta(seconds=1)).strftime("%Y-%m-%d %H:%M:%S")
#or
sec_1_more = t datetime.timedelta(seconds=1)
if start == 1:
new_file.append(tolist["memory"])
start = 0
if x == 10:
#print(new_file)
#quit()
break # for test to see only 10 first
f.close() # needed
csvData = pd.DataFrame(new_file)
csvData.to_csv(coin_name "_unique_sorted_synced.csv", mode="w", index=False)
quit() # coz just one file processing for testing
CodePudding user response:
Looks like you can just read in the data, and use asfreq
:
# instead of read_clipboard, you'd read it with pd.read_csv
df = pd.read_clipboard(sep=',', parse_dates = ['time'])
df.set_index('time').asfreq(freq='1S').ffill()
open high low close Extremum Fib 1 Fib 2 Fib 3 l100
time
2022-04-03 02:00:00 3.294 3.294 3.294 3.294 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:01 3.294 3.294 3.294 3.294 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:02 3.294 3.294 3.294 3.294 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:03 3.294 3.294 3.294 3.294 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:04 3.294 3.295 3.292 3.292 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:05 3.293 3.293 3.292 3.292 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:06 3.293 3.293 3.292 3.292 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:07 3.293 3.293 3.293 3.293 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:08 3.293 3.293 3.293 3.293 3.277 3.332898 3.348094 3.357139 3.36745
2022-04-03 02:00:09 3.292 3.292 3.292 3.292 3.277 3.332898 3.348094 3.357139 3.36745