I wrote some code to update particular column values in a CSV file through a pandas DataFrame. After running it, I see an extra comma at the start of the header row, which adds an empty column and misaligns my CSV structure. For example, I updated the Age column from 26 to 30 for every row, and this is what I see in Notepad:
,Name,Age,Gender
Pratik,30,Male
Sarvesh,30,Male
As you can see, an extra column has been added before the Name header. How do I remove or prevent that? Below is my code.
df = pd.read_csv("{}/output/Float_Ingestion_Expected_Output_files/{}/{}.csv".format(str(parentDir), test_case_name, file_name,header=None))
#for x in df:
df['age'] = '2323323232444'
df.to_csv("{}/output/Float_Ingestion_Expected_Output_files/{}/{}.csv".format(str(parentDir), test_case_name, file_name, index=False))
print(df)
CodePudding user response:
If fixing the original CSV file is not an option, relabel the columns and drop the last one:
df.rename(columns=dict(zip(df.columns, df.columns[1:]))).dropna(axis=1)
#       Name  Age Gender
# 0   Pratik   30   Male
# 1  Sarvesh   30   Male
CodePudding user response:
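When you save the CSV, pass index=False to to_csv. In your code, index=False sits inside the parentheses of .format(...), so it goes to str.format (which silently ignores unused keyword arguments) and never reaches to_csv. Move it outside the closing parenthesis of .format(...). The same applies to header=None in your read_csv call.
By default, pandas writes the DataFrame index as an extra first column, so the header ends up looking something like this: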
Index, Name, Age, Gender
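Here is a minimal sketch of the difference, assuming a small in-memory DataFrame that stands in for your file (the sample rows and the "output/{}.csv" path template are made up for illustration). Calling to_csv without a path returns the CSV text as a string, which makes the header easy to inspect:

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV file.
df = pd.DataFrame(
    {"Name": ["Pratik", "Sarvesh"], "Age": [26, 26], "Gender": ["Male", "Male"]}
)
df["Age"] = 30  # update every row of the Age column

# str.format ignores unused keyword arguments, so index=False has no effect here.
path_with_misplaced_kwarg = "output/{}.csv".format("result", index=False)
path_plain = "output/{}.csv".format("result")

# Default behaviour: the index is written as an unnamed first column.
with_index = df.to_csv()

# index=False passed to to_csv itself: no extra leading column.
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```

In your script that would mean closing the .format(...) call first and then adding the keyword, i.e. df.to_csv("...".format(str(parentDir), test_case_name, file_name), index=False).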