I am trying to read a CSV into a dataframe and update column values in it. The problem is that I cannot get rid of the index column: the console output shows an extra, unnamed index column, as follows.
fsym_id factset_entity_id ... is_substituted is_current
0 VVG1JM-S ABCXYZ-Z ... True False
As you can see above, there is no column name for the 0. If I try to drop the first column of the dataframe with df.drop(columns = df.columns[0], axis = 1, inplace= True)
, it drops the fsym_id column, which I need. Below is the code.
def update_run_id_in_csv(rds_db_conn,test_case_name,file_name):
    df = pd.read_csv("{}/output/Float_Ingestion_Expected_Output_files/{}/{}.csv".format(str(parentDir), test_case_name, file_name))
    df['run_id'] = '2323323232999'
    #get_run_id(rds_db_conn,max_13f_query,query)
    df.drop(columns = df.columns[0], axis = 1, inplace= True)
    print(df)
There is no index column in the CSV, and I do not understand how one gets added when I update the run_id column in the dataframe. How do I get rid of the index column?
CodePudding user response:
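As far as I know, you cannot remove the index from a pandas dataframe: every dataframe has one, and it is not considered a column. The 0 on the left of your output is the row label that pandas prints, which is why df.columns[0] is fsym_id.
However, when you write the dataframe back to CSV, you can leave the index out like this: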
df.to_csv(path, index = False)
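A minimal sketch of the point above, using a small made-up frame (the values and the in-memory buffer are assumptions for illustration, not the asker's file). It shows that the index is not among the columns, and that it can be hidden when printing or omitted when writing:

```python
import io
import pandas as pd

# Hypothetical two-column frame standing in for the real CSV data.
df = pd.DataFrame({"fsym_id": ["VVG1JM-S"], "is_current": [False]})
df["run_id"] = "2323323232999"

# The index is not part of df.columns, so there is nothing to drop.
columns = list(df.columns)

# Print the frame without the row labels.
printed = df.to_string(index=False)
print(printed)

# Write the frame without the row labels; StringIO stands in for a file path.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

In the function from the question, this would mean removing the df.drop(...) line and, if the leading 0 in the console is the only concern, printing with to_string(index=False).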
CodePudding user response:
You can try this. If the CSV was saved together with its index, then when you read it back, the column 'Unnamed: 0' contains your previous index.
df = pd.read_csv('file_name.csv').drop(['Unnamed: 0'], axis=1)
or
df.drop(['Unnamed: 0'], axis = 1, inplace= True)
A better way is to specify pd.read_csv(..., index_col=[0]), which avoids the extra "drop" call.
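To illustrate where such an unnamed column comes from, here is a small sketch. The sample data is made up, and the CSV text is kept in memory rather than in a file, so treat it as an illustration under those assumptions rather than the asker's exact situation:

```python
import io
import pandas as pd

# Hypothetical frame, written WITH its index (the to_csv default).
original = pd.DataFrame({"fsym_id": ["VVG1JM-S"], "is_current": [False]})
csv_text = original.to_csv()  # the header line starts with an empty field

# Plain read: the saved index comes back as a column named "Unnamed: 0".
plain = pd.read_csv(io.StringIO(csv_text))

# Option 1: drop that column after reading.
dropped = plain.drop(columns=["Unnamed: 0"])

# Option 2: tell read_csv that the first column is the index.
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
```

Note that this only applies when the file really has that leading unnamed column. If the CSV has no index column, as the question states, there is no 'Unnamed: 0' to drop.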