Trying to access to specific columns in a multi-indexed dataframe but am getting a length mismatch e

Time:12-24

I am trying to create a function that will allow me to create a new data frame from two specific columns in a multi-indexed data frame. However, I am getting this error: "ValueError: Length mismatch: Expected axis has 0 elements, new values have 3 elements". From my understanding, it is telling me that I have created a data frame that has zero columns, but I have created the data frame with the 3 columns.

I am not sure where I am going wrong. Here is my code:

def pairs(ticker1, ticker2):
    pairs = pd.DataFrame()
    pairs.columns = ['Date', ticker1, ticker2]
    pairs = data_df.loc[data_df['Ticker'] == ticker1]
    pairs = pairs.merge(data_df.loc[data_df['Ticker'] == ticker2], on='Date')
    return pairs

Here is a picture of the data frame I am trying to get the data from: (image: data frame)

I have tried using something like

pairs = data_df[(data_df.ticker1.isin([ticker1,ticker2])) & (data_df.ticker2.isin([ticker1,ticker2]))]

and couldn't get it to work either. I am probably making a very obvious newbie mistake.

CodePudding user response:

The way you are locating the index is wrong. With a multi-index you need to give one label per index level. For example, assume the data frame has two columns, colA and colB, and that a and b are the labels for the first and second index levels. You can then access colB for that row with the following line of code:

df.loc[(a, b), 'colB']
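
As a minimal sketch of that tuple lookup, the example below builds a small two-level frame. The level names (Date, Ticker), the ticker symbols and the values are all made up for illustration, since the real data frame is only available as a picture.

```python
import pandas as pd

# Hypothetical data: rows are indexed by two levels, (Date, Ticker).
index = pd.MultiIndex.from_tuples(
    [
        ("2022-01-03", "AAPL"),
        ("2022-01-03", "MSFT"),
        ("2022-01-04", "AAPL"),
        ("2022-01-04", "MSFT"),
    ],
    names=["Date", "Ticker"],
)
df = pd.DataFrame(
    {"colA": [1.0, 2.0, 3.0, 4.0], "colB": [10.0, 20.0, 30.0, 40.0]},
    index=index,
)

a, b = "2022-01-03", "MSFT"

# One label per index level, passed as a tuple, selects a single row;
# adding a column name returns the value in that cell.
value = df.loc[(a, b), "colB"]

# To take every row for one ticker regardless of date, use a cross-section.
msft_rows = df.xs("MSFT", level="Ticker")
```

Here value is the colB entry for that one (Date, Ticker) pair, and msft_rows keeps only the Date level in its index.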

If you are getting a length mismatch error, it may be because the index labels you are specifying do not exist in the DataFrame, or because you are trying to select multiple columns and indexes that have different lengths. Make sure that the labels you specify are correct and that you are only selecting columns with matching lengths.
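
For the exact message in the question, note that pd.DataFrame() has no columns, so assigning three names to pairs.columns raises the error before any data is selected. The sketch below reproduces that and shows one possible rewrite of pairs. It assumes data_df is in long format with Date and Ticker columns plus a value column, here called Close; that column name and the sample values are assumptions, not taken from the original data.

```python
import pandas as pd

# Hypothetical long-format data; the real data_df is only shown as a picture.
data_df = pd.DataFrame(
    {
        "Date": ["2022-01-03", "2022-01-03", "2022-01-04", "2022-01-04"],
        "Ticker": ["AAPL", "MSFT", "AAPL", "MSFT"],
        "Close": [182.0, 334.8, 179.7, 329.0],
    }
)

# Reproduce the error: an empty frame has 0 columns, so 3 names do not fit.
try:
    empty = pd.DataFrame()
    empty.columns = ["Date", "AAPL", "MSFT"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)


def pairs(ticker1, ticker2, value_col="Close"):
    """Return one row per Date with one value column per ticker."""

    def one(ticker):
        # Keep only this ticker's rows and name the value column after it.
        rows = data_df.loc[data_df["Ticker"] == ticker, ["Date", value_col]]
        return rows.rename(columns={value_col: ticker})

    # Inner join on Date, so only dates present for both tickers remain.
    return one(ticker1).merge(one(ticker2), on="Date")


result = pairs("AAPL", "MSFT")
```

The column names come from the rename step, so there is no need to create an empty frame or set its columns up front.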

Now, let's dive deeper into your question. If you select multiple indexes that have different lengths, you will probably get a length mismatch error. To avoid this, you can either select your columns separately or use boolean indexing to select only the columns that have matching lengths, as in the following lines of code:

# Columns with no missing values in the selected row(s)
matchesCol = df.columns[df.loc[[(a, b)]].notna().all()]
df.loc[(a, b), matchesCol]
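
A small self-contained version of this boolean column filter might look like the following. The frame, its level names and the values are hypothetical. The row is selected with a list of tuples so that the result is a DataFrame and .notna().all() gives one boolean per column.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level frame with one missing value in colB.
index = pd.MultiIndex.from_tuples(
    [("x", 1), ("x", 2), ("y", 1)], names=["first", "second"]
)
df = pd.DataFrame(
    {
        "colA": [1.0, 2.0, 3.0],
        "colB": [np.nan, 5.0, 6.0],
        "colC": [7.0, 8.0, 9.0],
    },
    index=index,
)

a, b = "x", 1

# Selecting with a list of tuples returns a DataFrame, so
# .notna().all() yields one boolean per column.
complete = df.loc[[(a, b)]].notna().all()

# Keep only the columns that have no missing value in the selected row.
matches_col = df.columns[complete]
selected = df.loc[(a, b), matches_col]
```

In this sketch colB is dropped because it is missing for the chosen row, and selected holds the remaining two values.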