How to concatenate by rows and save the entire iteration of a for loop | Pandas

Time:07-04

I'm having trouble concatenating my results into a pandas DataFrame. The values end up in the right place, but only the last iteration of the for loop is printed. Each column should be a ticker symbol (500 tickers), and the rows should be the beta, alpha, and standard error (3 rows in total).

Here is the code

def regression_stats():

    total_df = pd.DataFrame()

    for t in tickers:
        model = smf.ols(f'{t} ~ SP50', data=df1).fit()
        ticker = (f'{t}')
        beta = model.params['SP50']
        alpha = model.params['Intercept']
        std_error = model.bse['Intercept']

        resBeta = pd.DataFrame({f'{t}': beta },index=[0])
        resAlpha = pd.DataFrame({f'{t}': alpha},index=[0])
        resStderr = pd.DataFrame({f'{t}': std_error},index=[0])

        df_con = pd.concat([resBeta, resAlpha, resStderr],axis=0)
    total_df = pd.concat([total_df, df_con],axis=0)

    # total_df.to_csv('source/data/sp_regression.csv')
    print(total_df)

Here is the output

           SP50
0  1.000000e+00
0 -8.673617e-19
0  9.805668e-20

Answer:

The last total_df = pd.concat(...) line is not indented far enough, so it sits outside the for loop. Change your code to this:

# Imports implied by the pd and smf aliases used below
import pandas as pd
import statsmodels.formula.api as smf

def regression_stats():
    # tickers and df1 are expected to be defined elsewhere, as in the question

    total_df = pd.DataFrame()

    for t in tickers:
        model = smf.ols(f'{t} ~ SP50', data=df1).fit()
        ticker = (f'{t}')
        beta = model.params['SP50']
        alpha = model.params['Intercept']
        std_error = model.bse['Intercept']

        resBeta = pd.DataFrame({f'{t}': beta },index=[0])
        resAlpha = pd.DataFrame({f'{t}': alpha},index=[0])
        resStderr = pd.DataFrame({f'{t}': std_error},index=[0])

        df_con = pd.concat([resBeta, resAlpha, resStderr],axis=0)
        # Now inside the loop: runs once per ticker
        total_df = pd.concat([total_df, df_con],axis=0)

    # total_df.to_csv('source/data/sp_regression.csv')
    print(total_df)

Everything indented under the for statement belongs to the loop body. In your code the concatenation with total_df was indented one level less, so it was not part of the loop: it ran only once, after the loop had finished, using the df_con from the last iteration.
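
A side note on the layout the question asks for (one column per ticker, three rows): because both concat calls use axis=0, each ticker's block is stacked below the previous one instead of being placed beside it. One possible alternative, shown here only as a rough sketch, is to collect the three statistics per ticker in a dict inside the loop and build the DataFrame once at the end. The function fake_fit, the tickers AAA and BBB, and all numbers are hypothetical stand-ins for the smf.ols fit in the question, so the snippet runs without statsmodels or the original data.

```python
import pandas as pd

# Hypothetical stand-in for the smf.ols(...).fit() call in the question.
# It returns (beta, alpha, std_error) for one ticker; the numbers are made up.
def fake_fit(ticker):
    made_up = {
        "AAA": (1.2, 0.01, 0.003),
        "BBB": (0.8, -0.02, 0.004),
    }
    return made_up[ticker]

def regression_stats(tickers):
    columns = {}
    for t in tickers:
        beta, alpha, std_error = fake_fit(t)
        # Stored inside the loop, so every iteration is kept.
        columns[t] = pd.Series(
            {"beta": beta, "alpha": alpha, "std_error": std_error}
        )
    # Build the frame once: dict keys become columns,
    # the Series labels become the three rows.
    return pd.DataFrame(columns, index=["beta", "alpha", "std_error"])

total_df = regression_stats(["AAA", "BBB"])
print(total_df)
```

With the real data, the body of fake_fit would be replaced by the model fit and the three lookups from the question. Building the frame once at the end also avoids calling pd.concat 500 times on a growing DataFrame.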
