I would like to merge 2 dataframes with different index column names. I have df1 with the proper order of tickers that i want and would like to merge df2 witch the same tickers but in a different order and additional information.
so my df1 looks like:
df1
index | XOO | TSLA | FORD | YYX | AAPL |
---|---|---|---|---|---|
avg | 24 | 51 | 25 | 35 | 34 |
std | 3 | 3 | 4 | 5 | 2 |
min | 1 | 2 | 4 | 5 | 6 |
max | 78 | 78 | 97 | 89 | 97 |
df2
Index | 10 | 11 | 62 | 454 | 988 |
---|---|---|---|---|---|
ticker | AAPL | FORD | TSLA | YYX | XOO |
sector | Tech | Auto | Tech | ETF | ETF |
My df2 has its information imported from an excel so the column names are its index numbers which I do not want. The ticker names are also in the wrong order and I need the sectors to match the tickers. The dataframe Id like to create looks like:
df3
index | XOO | TSLA | FORD | YYX | AAPL |
---|---|---|---|---|---|
avg | 24 | 51 | 25 | 35 | 34 |
std | 3 | 3 | 4 | 5 | 2 |
min | 1 | 2 | 4 | 5 | 6 |
max | 78 | 78 | 97 | 89 | 97 |
sector | ETF | Tech | Auto | ETF | Tech |
I have tried to set the index using
Industry = Industry.set_index('Index').T.set_index('ticker').T
but I get an error that says "None of ['Index'] are in the columns"
I have also tried:
df3 = pd.concat([df1,df2])
but it creates a dataframe with df2 not matching the tickers and instead put it next to df1.
CodePudding user response:
import pandas as pd
# Import the tables you provided, which I saved as CSVs
df1 = pd.read_csv('df1.csv')
df2 = pd.read_csv('df2.csv', header=1)
# Rename the "ticker" column so that all it is the same as the corresponding column name in df1
df2.rename(columns={'ticker': 'index'}, inplace=True)
# Concatenate
df3 = pd.concat([df1, df2])
df3
Two changes to make:
- Make sure you're selecting the correct row when you import
df2
. Useheader=1
if the column names are in row 1 of your CSV. - Rename the "ticker" column to "index" (so that it matches df1). That should do the trick, and the merge should work correctly.