Pandas concat is adding unnamed index

Time:11-17

When I concat two dataframes that both have 337 columns and then export the result to CSV, the result has 338 columns: a new unnamed index column is added each time.

df1
Out[141]: 
  datecreated     1     2     3     4     5  ...   331   332   333   334   335   336
0  2022-11-14  4000  3900  3850  3810  3790  ...  5520  5300  5180  4990  4730  4520
1  2022-11-15  4000  3900  3850  3810  3790  ...  5520  5300  5180  4990  4730  4520
[2 rows x 337 columns]
df4
Out[142]: 
  datecreated       1       2       3  ...     333     334     335     336
0  2022-11-16  4080.0  3980.0  3940.0  ...  5510.0  5290.0  4960.0  4700.0
[1 rows x 337 columns]

using the concatenation:

df5 = pd.concat([df1, df4], ignore_index=True)

and then exporting to CSV:

csv_buffer = StringIO()
df5.to_csv(csv_buffer)
file_name = 'outputs.csv'
s3_resource.Object(bucket_name, file_name).put(Body=csv_buffer.getvalue())

yields 338 columns, including an unnamed index column, after fetching the updated output file:

body = s3_client.get_object(Bucket=bucket_name, Key='outputs.csv')['Body']
contents = body.read().decode('utf-8')
df1 = pd.read_csv(StringIO(contents), parse_dates=['datecreated'])

df1
Out[134]: 
   Unnamed: 0 datecreated       1       2  ...     333     334     335     336
0           0  2022-11-14  4000.0  3900.0  ...  5180.0  4990.0  4730.0  4520.0
1           1  2022-11-15  4000.0  3900.0  ...  5180.0  4990.0  4730.0  4520.0
2           2  2022-11-16  4080.0  3980.0  ...  5510.0  5290.0  4960.0  4700.0

What is causing this?

CodePudding user response:

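The unnamed column is the row index of the dataframe, which to_csv writes as the first column by default. Because that column has no header, read_csv labels it "Unnamed: 0" when the file is loaded again. If you do not want it in the file, pass index=False to to_csv:

df5.to_csv(csv_buffer, index=False)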
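
The round trip can be reproduced without S3. The following is a minimal sketch, assuming small made-up frames with two value columns in place of the 337-column df1 and df4 from the question. The buffer and variable names are illustrative only. It also shows index_col=0 as a possible alternative for files that were already written with the index.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for df1 and df4 from the question (2 value columns instead of 336).
df1 = pd.DataFrame({"datecreated": ["2022-11-14", "2022-11-15"],
                    "1": [4000, 4000], "2": [3900, 3900]})
df4 = pd.DataFrame({"datecreated": ["2022-11-16"], "1": [4080.0], "2": [3980.0]})

df5 = pd.concat([df1, df4], ignore_index=True)

# Default behaviour: to_csv writes the row index as a first column with an empty header.
with_index = StringIO()
df5.to_csv(with_index)
reread_with_index = pd.read_csv(StringIO(with_index.getvalue()),
                                parse_dates=["datecreated"])

# index=False: the row index is left out of the file.
without_index = StringIO()
df5.to_csv(without_index, index=False)
reread_clean = pd.read_csv(StringIO(without_index.getvalue()),
                           parse_dates=["datecreated"])

# For a file that already contains the index column, index_col=0 tells read_csv
# to use that first column as the index instead of as data.
reread_index_col = pd.read_csv(StringIO(with_index.getvalue()),
                               index_col=0, parse_dates=["datecreated"])

print(list(reread_with_index.columns))  # ['Unnamed: 0', 'datecreated', '1', '2']
print(list(reread_clean.columns))       # ['datecreated', '1', '2']
print(list(reread_index_col.columns))   # ['datecreated', '1', '2']
```

If the file is read, appended to, and written back repeatedly, as in the question, writing with index=False keeps the column count stable across runs.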