Hi I am trying to create multiple csv files from a single big csv using python. The original csv file has multiple stocks data in 1 min date/time with Open, high, low, close, volume as other columns.
Sample data from original file is here
At first, I tried to copy individual Ticker and all its corresponding values to a new file with following code:
import pandas as pd
excel_file_path=r'C:\Users\mahan\Documents\test projects\01_07_APR_WEEKLY_expiry_data_VEGE_NF_AND_BNF_Options_Desktop_Vege.csv'
export_path=r"C:\Users\mahan\Documents\exportfiles\{output_file_name}_sheet.csv"
data= pd.read_csv(excel_file_path, index_col="Ticker") #Making data frame from csv file
rows= data.loc[['NIFTYWK17500CE']] #Retrieving rows by loc method
output_file_name ="NIFTYWK17500CE_"
print(type(rows))
rows
rows.to_csv(export_path)
Result was something like this: a file was saved with the name "{output_file_name}__sheet.csv"
I failed at naming the file but data was copied pertaining to all the values with Ticker value 'NIFTYWK17500CE'.
Then I tried to create a array with column "Ticker" to find unique values. Created a dataframe with original file for all the data. And tried to use a For loop for values in the array matching the 1st column 'Ticker' and copy those data to a new file using the value in the exporting csv file name.
code as below:
import pandas as pd
excel_file_path=r'C:\Users\mahan\Documents\test projects\01_07_APR_WEEKLY_expiry_data_VEGE_NF_AND_BNF_Options_Desktop_Vege.csv'
df2=pd.read_csv(excel_file_path)
df2_uniques =df2['Ticker'].unique()
df2_counts=df2['Ticker'].value_counts()
for value in df2_uniques:
value=value.replace(' ', '_')
export_path=r"C:\Users\mahan\Documents\exportfiles\{value}__sheet.csv"
df=pd.read_csv(excel_file_path,index_col="Ticker")
rows=df.loc[['value']]
print(type(rows))
rows.to_csv(export_path)
Received an error:
KeyError: "None of [Index(['value'], dtype='object', name='Ticker')] are in the [index]"
Where did I went wrong:
- In naming the file properly to save in earlier code.
- In the second code.
Any help is really appreciated. Thanks in advance.
SOLVED
What worked for me was the following with comments:
import pandas as pd
excel_file_path=r'C:\Users\mahan\Documents\test projects\01_07_APR_WEEKLY_expiry_data_VEGE_NF_AND_BNF_Options_Desktop_Vege.csv'
df2=pd.read_csv(excel_file_path)
df2_uniques =df2['Ticker'].unique()
for value in df2_uniques:
value=value.replace(' ', '_')
df=pd.read_csv(excel_file_path,index_col="Ticker")
rows=df.loc[[value]] #Changed from 'value' to value
print(type(rows))
rows.to_csv(r'_' value '.csv')
#Removed export_path as filename and filepath together were giving me hard time to figure out.
#The files get saved in same filepath as the original imported filepath. So that'll do. sharing just for reference
CodePudding user response:
I can't know for sure without seeing the dataframe, but the error indicates that there is no column name 'Ticker'
. It appears that you set this column to be the index, so you can try df2_uniques = set(df2.index)
.
CodePudding user response:
changed
rows=df.loc[['value']] to rows=df.loc[[value]]
Also, Removed export_path as both filename and filepath together were giving me hard time to figure out. The files get saved in same filepath as the original imported filepath. So that'll do. Sharing just for reference
Final code that worked looked like this:
import pandas as pd
excel_file_path=r'C:\Users\mahan\Documents\test projects\01_07_APR_WEEKLY_expiry_data_VEGE_NF_AND_BNF_Options_Desktop_Vege.csv'
df2=pd.read_csv(excel_file_path)
df2_uniques =df2['Ticker'].unique()
for value in df2_uniques:
value=value.replace(' ', '_')
df=pd.read_csv(excel_file_path,index_col="Ticker")
rows=df.loc[[value]] #Changed from 'value' to value
print(type(rows))
rows.to_csv(r'_' value '.csv')