I have a Python script that goes out and pulls a huge chunk of JSON data and then iterates over it to build two lists:
# Get all price data
response = c.get_price_history_every_minute(symbol)

# Build prices list
prices = list()
for i in range(len(response.json()["candles"])):
    prices.append(response.json()["candles"][i]["prices"])

# Build times list
times = list()
for i in range(len(response.json()["candles"])):
    times.append(response.json()["candles"][i]["datetime"])
This works fine, but it takes a LONG time to pull in all of the data and build the lists. I am doing some testing while building out a more complex script, and I would like to save these two lists to two files, then load the data from those files to recreate the lists on subsequent test runs, skipping the generating, iterating, and parsing of the JSON.
I have been trying the following:
# Write Price to a File
a_file = open("prices7.txt", "w")
content = str(prices)
a_file.write(content)
a_file.close()
And then in future scripts:
# Load Prices from File
from array import array

prices_test = array('d')
a_file = open("prices7.txt", "r")
prices_test = a_file.read()
The outputs from my json lists and the data loaded into the list created from the file output look identical, but when I try to do anything with the data loaded from a file it is garbage...
print(prices)
The output looks like this: [69.73, 69.72, 69.64, ... 69.85, 69.82, etc.]
print(prices_test)
The output looks identical.
If I run a simple query like:
print(prices[1], prices[2])
I get the expected output (69.73, 69.72)
If I do the same on the list created from the file:
print(prices_test[1], prices_test[2])
I get the output ( [,6 )
It is pulling every character in the string individually instead of using the comma-separated values as I would have expected...
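To make it concrete, here is a minimal check against the same prices7.txt from above, showing that read() hands back one long string, so indexing it returns single characters:
# Minimal check: read() returns one long string, so indexing gives characters
a_file = open("prices7.txt", "r")
prices_test = a_file.read()
a_file.close()
print(type(prices_test))               # <class 'str'>
print(prices_test[0], prices_test[1])  # first two characters, e.g. '[' and '6'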
I've googled every combination of search terms I could think of so any help would be GREATLY appreciated!!
CodePudding user response:
I had to do something like this before. I used pickle to do it.
import pickle


def pickle_the_data(pickle_name, list_to_pickle):
    """This function pickles a given list.

    Args:
        pickle_name (str): name of the resulting pickle.
        list_to_pickle (list): list that you need to pickle.
    """
    with open(pickle_name + '.pickle', 'wb') as pikd:
        pickle.dump(list_to_pickle, pikd)
    file_name = pickle_name + '.pickle'
    print(f'{file_name}: Created.')


def unpickle_the_data(pickle_file_name):
    """This will unpickle a pickled file.

    Args:
        pickle_file_name (str): file name of the pickle.

    Returns:
        list: when we pass a pickled list, it will return an
            unpickled list.
    """
    with open(pickle_file_name, 'rb') as pk_file:
        unpickleddata = pickle.load(pk_file)
    return unpickleddata
So first pickle your list: pickle_the_data(name_for_pickle, your_list)
Then, when you need the list back, call unpickle_the_data(name_of_your_pickle_file) and use its return value.
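A quick usage sketch, reusing the prices list built in the question (the pickle name 'prices7' here is just an example):
# Usage sketch: 'prices' is the list from the question, 'prices7' is an example name
pickle_the_data('prices7', prices)                 # creates prices7.pickle
prices_test = unpickle_the_data('prices7.pickle')
print(prices_test[1], prices_test[2])              # floats again, not single characters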
CodePudding user response:
This is what I'm trying to explain in the comments section. Note I replaced response.json() with jsonData, taking it out of each for-loop, and reduced both loops into a single one for more efficiency. Now the code should run faster.
import json


def saveData(filename, data):
    # Convert Data to a JSON String
    data = json.dumps(data)

    # Open the file, then save it
    try:
        file = open(filename, "wt")
    except OSError:
        print("Failed to save the file.")
        return False
    else:
        file.write(data)
        file.close()
        return True


def loadData(filename):
    # Open the file, then load its contents
    try:
        file = open(filename, "rt")
    except OSError:
        print("Failed to load the file.")
        return None
    else:
        data = file.read()
        file.close()

        # Data is a JSON string, so now we convert it back
        # to a Python structure:
        data = json.loads(data)
        return data
# Get all price data
response = c.get_price_history_every_minute(symbol)
jsonData = response.json()

# Build prices and times list:
#
# As you're iterating over the same "candles" index in both loops
# when building those two lists, just reduce it to a single loop
prices = list()
times = list()
for i in range(len(jsonData["candles"])):
    prices.append(jsonData["candles"][i]["prices"])
    times.append(jsonData["candles"][i]["datetime"])

# Now, when you need, just save each list like this:
saveData("prices_list.json", prices)
saveData("times_list.json", times)

# And retrieve them back when you need them later:
prices = loadData("prices_list.json")
times = loadData("times_list.json")
Btw, pickle does the same thing, but it uses binary data instead of JSON, which is probably faster for saving / loading the data. I don't know, I haven't tested it.
With JSON you have the advantage of readability, as you can open each file and read it directly, if you understand JSON syntax.
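If you want to check the speed difference yourself, here is a rough timing sketch (the sample list of floats is made up, and the file names are arbitrary; swap in your real lists):
# Rough timing sketch: compare json vs pickle when saving a list of floats.
# The sample data below is made up; replace it with your real lists.
import json
import pickle
import time

sample = [69.73 + i * 0.01 for i in range(500_000)]

start = time.perf_counter()
with open("sample.json", "w") as f:
    json.dump(sample, f)
print("json dump:", time.perf_counter() - start)

start = time.perf_counter()
with open("sample.pickle", "wb") as f:
    pickle.dump(sample, f)
print("pickle dump:", time.perf_counter() - start)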