I have multiple .csv files with different names like ATUL.csv, ISEC.csv, XYZ.csv and so on... Every file has similar data format mentioned below:
datetime symbol open high low close volume
2005-03-10 09:15:00 NSE:ATUL 85.59 89.00 85.19 86.84 73582
2005-03-11 09:15:00 NSE:ATUL 89.44 89.80 85.50 85.94 153945
2005-03-14 09:15:00 NSE:ATUL 86.90 88.75 84.00 84.65 73539
2005-03-15 09:15:00 NSE:ATUL 85.00 85.94 82.00 82.40 79053
I want to create a new .txt for every .csv file with formatting given below:
- remove 'symbol' column
- remove header row i.e. datetime|symbol|open.....
- change 'datetime' column date and time format to YYYYMMDD
- separate each column with ";"
Desired format sample given below for understanding.
20050310;85.59;89.00;85.19;86.84;73582
20050311;89.44;89.80;85.50;85.94;153945
20050314;86.90;88.75;84.00;84.65;73539
20050315;85.00;85.94;82.00;82.40;79053
I am trying below code to get my output
import pandas as pd
data = pd.read_csv('ATUL.csv')
data.drop('symbol', inplace=True, axis=1)
data['datetime'] = pd.to_datetime(data['datetime']).dt.strftime('%Y%m%d')
data.to_csv('output.txt', sep=';', header=None, index=False)
Above code is working great but its creating/converting one file at a time, my problem is that above with above code I am able to convert only one file at once, I have more than 500 files that I have to update on daily basis.
Is there any way or code that I could convert all my .csv files to text files with desired formatting (as motioned above code)
CodePudding user response:
I'm not a Python expert, there is my solution for your problem.
First pack all your file in one single directory, then use os
module to browse into it.
EDIT
import os
import pandas as pd
path = "path/to/directory/with/csv"
with os.scandir(path) as it:
for entry in it:
if entry.is_file() and entry.name.endswith(".csv"): # Treat only csv files
# Convert csv to txt with your code here
data = pd.read_csv(entry.name)
data.drop('symbol', inplace=True, axis=1)
data['datetime'] = pd.to_datetime(data['datetime']).dt.strftime('%Y%m%d')
data.to_csv('output.txt', sep=';', header=None, index=False)
This should work if you replace path value with the actual path to directory with your files on your computer.
CodePudding user response:
Here's code that does exactly what I suggested in my earlier comment.
It makes use of the built-in pathlib
module to simplify processing.
import pandas as pd
from pathlib import Path
def export_csv(input_filepath, output_filepath):
"""Reformat input file and save result to the given output file path."""
data = pd.read_csv(input_filepath)
data.drop('symbol', inplace=True, axis=1)
data['datetime'] = pd.to_datetime(data['datetime']).dt.strftime('%Y%m%d')
data.to_csv(output_filepath, sep=';', header=None, index=False)
folderpath = Path('path/to/csv/files/folder').resolve()
new_suffix = '.txt'
# Convert all .csv files in given folder.
for input_filepath in folderpath.glob('*.csv'):
# Output file path is the same as the input file except it has a different
# extension.
output_filepath = input_filepath.parent / (input_filepath.stem new_suffix)
export_csv(input_filepath, output_filepath) # Convert the file.