How do I make a table for historic data?

How do I make a dataset that shows historic data from snapshots?

I have a CSV file that is overwritten with new snapshot data once a day. I would like to write a Python script that regularly updates the historic data with the current snapshot.

One way I thought of was the following:

import pandas as pd

# Read csv-file
snapshot = pd.read_csv('C:/source/snapshot_data.csv')

# Try to read potential trend-data
try:
    historic = pd.read_csv('C:/merged/historic_data.csv')
    # Merge the two dfs and write back to historic file-path
    historic.merge(snapshot).to_csv('C:/merged/historic_data.csv')

except:
    snapshot.to_csv('C:/merged/historic_data.csv')

However, I don't like that I rely on a try block to read the historic data if the file exists and fall back to writing the snapshot data to the historic path if it doesn't. Does anyone know a better way of creating a trend dataset?

CodePudding user response:

You can use the os module to check whether the file exists, and the mode argument of the to_csv function to append data to the file.

The code below will:

Read from snapshot.csv.
Check whether the historic.csv file exists.
If it does not exist yet, write the header row; otherwise, skip the header.
Save the file. If the file already exists, new data will be appended to the file instead of overwriting it.

import os
import pandas as pd

# Read snapshot file
snapshot = pd.read_csv("snapshot.csv")

# Check if historic data file exists
file_path = "historic.csv"
header = not os.path.exists(file_path)  # write the header only when the file does not exist yet

# Create or append to the historic data file
snapshot.to_csv(file_path, header=header, index=False, mode="a")
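
Since each day's rows end up in the same file, it may help to record which snapshot they came from. The following is only a sketch under assumptions: the function name `append_snapshot` and the column name `snapshot_date` are hypothetical and not part of the original question.

```python
import os
import pandas as pd


def append_snapshot(snapshot_path, historic_path, snapshot_date=None):
    """Append one snapshot to the historic file, tagging each row with a date."""
    snapshot = pd.read_csv(snapshot_path)
    # Hypothetical column: lets rows from different days be told apart later
    if snapshot_date is None:
        snapshot_date = pd.Timestamp.today().strftime("%Y-%m-%d")
    snapshot["snapshot_date"] = snapshot_date
    # Write the header only when the historic file does not exist yet
    write_header = not os.path.exists(historic_path)
    snapshot.to_csv(historic_path, header=write_header, index=False, mode="a")
```

Calling `append_snapshot("snapshot.csv", "historic.csv")` once a day would then grow the historic file by one dated batch of rows per run.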

CodePudding user response:

You could easily do it in one line by using the mode parameter of `to_csv`.

pandas.read_csv('snapshot.csv').to_csv('historic.csv', mode='a')

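It will create the file if it doesn't already exist, or append to it if it does. Note that, with the default arguments, every call writes the header row and the index column again, so you will probably want to set header and index as well.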

What happens if you don't have a new snapshot file? You might want to wrap that in a try...except block. The Pythonic way is typically to ask for forgiveness rather than permission.
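
A rough sketch of that "ask for forgiveness" idea, assuming a missing snapshot should simply be skipped. The function name `try_append` and its return value are hypothetical; `pandas.read_csv` raises `FileNotFoundError` when the path does not exist.

```python
import os
import pandas as pd


def try_append(snapshot_path, historic_path):
    """Return True if a snapshot was appended, False if no snapshot was found."""
    try:
        snapshot = pd.read_csv(snapshot_path)
    except FileNotFoundError:
        # No new snapshot available; leave the historic file untouched
        return False
    # Write the header only when the historic file is being created
    write_header = not os.path.exists(historic_path)
    snapshot.to_csv(historic_path, mode="a", header=write_header, index=False)
    return True
```

Catching only `FileNotFoundError` keeps other problems, such as a malformed CSV, visible instead of silently swallowing them.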

I wouldn't even bother with an external library like pandas, as the standard library has all you need to 'append' to a file.

with open('snapshot.csv', 'r') as snapshot:
    with open('historic.csv', 'a') as historic:
        # Iterate over the file object to copy it line by line
        for line in snapshot:
            historic.write(line)
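
Copying every line also copies the snapshot's header row on each run. A minimal sketch that keeps a single header, assuming the first line of the snapshot is its header and that each file ends with a newline (the function name `append_lines` is hypothetical):

```python
import os


def append_lines(snapshot_path, historic_path):
    """Append the snapshot to the historic file, keeping only one header row."""
    # An existing, non-empty historic file is assumed to already have a header
    has_header = os.path.exists(historic_path) and os.path.getsize(historic_path) > 0
    with open(snapshot_path, "r", newline="") as snapshot, \
            open(historic_path, "a", newline="") as historic:
        if has_header:
            next(snapshot, None)  # skip the snapshot's header row
        for line in snapshot:
            historic.write(line)
```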