Home > Software engineering >  Concat multiple csv files with common date column
Concat multiple csv files with common date column

Time:11-20

I have adjusted my csv files of different stocks to have the date column as index and the adj close price as another column. Then I renamed my adj close price as the filename, so different type of stock labels.

from google.colab import drive
drive.mount('/content/drive') 
import numpy as np
import os
import pandas as pd

os.chdir("/content/drive/My Drive")
path = "/content/drive/MyDrive/Stock"
files=os.listdir(path)

df=pd.DataFrame()

for file in files:
  if file.endswith(".csv"):
    columns = ['Adj Close', 'Date']
    df_temp = pd.read_csv(file,  usecols=columns, index_col='Date')
    df_temp = df_temp.rename(columns={'Adj Close': file})
    df = pd.concat([df_temp], ignore_index=False, axis=1)
    display(df)

Then I tried to concat the stock files However, the result was still multiple csv files. How can I concat the file so that the final file looks like this?

         |     Stock A    |     Stock B    |
Date     |     --------   |    --------    |
date1    | Adj Close 1A   | Adj Close 1B   |
date2    | Adj Close 2A   | Adj Close 2B   |

CodePudding user response:

From your code what I see is in the following line your concat syntax is wrong.

Instead of

df = pd.concat([df_temp], ignore_index=False, axis=1)

You should also pass df in while concatenating:

df = pd.concat([df,df_temp], ignore_index=False, axis=1)

And for more information please share some sample data of the input and the output data so that we can understand them properly.

Thanks!

CodePudding user response:

I would pd.concat outside of the for loop, which would, instead, be used to create a Python list to pass into pd.concat.

Also, initialising an empty DataFrame using df = pd.DataFrame() is not required.

Full Code

from google.colab import drive
drive.mount('/content/drive') 
import numpy as np
import os
import pandas as pd

os.chdir("/content/drive/My Drive")
path = "/content/drive/MyDrive/Stock"
files = os.listdir(path)

df_list = []
for file in files:
    if file.endswith(".csv"):
        columns = ['Adj Close', 'Date']
        df_temp = pd.read_csv(file,  usecols=columns, index_col='Date')
        df_temp = df_temp.rename(columns={'Adj Close': file})
        df_list.append(df_temp)

df = pd.concat(df_list, ignore_index=False, axis=1)

display(df)
  • Related