Loop through parameter dictionary and create dataframe values

Is there a faster/better way to set parameters and then use them to set values in a dataframe? See basic code example below.

import numpy as np
import pandas as pd

df_loop = pd.DataFrame(columns=['streak', 'bet', 'runs'], index=np.arange(1,100,1))

params = {
        'streak_game' : [3,4,5,6,7],
        'initial_bet' : [50, 100, 150, 200, 250],
        'run_diff_abs' : [x for x in range(150, 40,-10)]
         }

for i in params['streak_game']:
    streak_game = i
    for j in params['initial_bet']:
        initial_bet = j
        for k in params['run_diff_abs']:
            run_diff = k

            # actual code is more complex, but I am setting a bunch of values similar to below
            for idx, row in df_loop.iterrows():
                df_loop.loc[idx, 'streak'] = i
                df_loop.loc[idx, 'bet'] = j
                df_loop.loc[idx, 'runs'] = k

The actual dataframe has about 4,600 rows, but I plan on creating larger data sets to test my logic.

CodePudding user response：

You can try

import itertools
out = df_loop.join(pd.DataFrame(list(itertools.product(*params.values())), columns=list(params)))
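
One thing worth checking with this approach: join aligns on the index. In the question, df_loop is indexed 1..99, while the product frame gets a default 0..274 index, so only the combinations labelled 1..99 end up attached. A minimal sketch, assuming the params dictionary and df_loop from the question (the name grid is introduced here only for illustration):

```python
import itertools

import numpy as np
import pandas as pd

params = {
    "streak_game": [3, 4, 5, 6, 7],
    "initial_bet": [50, 100, 150, 200, 250],
    "run_diff_abs": list(range(150, 40, -10)),
}

# One row per combination of the three parameter lists (5 * 5 * 11 = 275 rows).
grid = pd.DataFrame(list(itertools.product(*params.values())), columns=list(params))

df_loop = pd.DataFrame(columns=["streak", "bet", "runs"], index=np.arange(1, 100, 1))

# join is a left join on the index here, so only grid rows labelled 1..99 are attached.
out = df_loop.join(grid)
```

If the goal is the full set of combinations rather than 99 of them, grid on its own may already be the dataframe you want.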

CodePudding user response：

I'm not sure why you initialize the dataframe with NaN values. In any case, you should first build the parameter combinations in a separate variable before assigning them to the dataframe.

my_param = [(i,j,k) for i in params['streak_game'] for j in params['initial_bet'] for k in params['run_diff_abs']]

Next, instead of using iterrows, you can assign each row by position with iloc.

for i in range(len(df_loop)):
    df_loop.iloc[i] = my_param[i]

Deeply nested loops only add redundancy to your code.
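
The row-by-row iloc assignment can probably be dropped as well. A rough sketch, assuming the same params and df_loop as in the question: slice the list of combinations to the number of rows in df_loop and rebuild the frame in a single call (n_rows is a name introduced here for illustration).

```python
import numpy as np
import pandas as pd

params = {
    "streak_game": [3, 4, 5, 6, 7],
    "initial_bet": [50, 100, 150, 200, 250],
    "run_diff_abs": list(range(150, 40, -10)),
}

df_loop = pd.DataFrame(columns=["streak", "bet", "runs"], index=np.arange(1, 100, 1))

my_param = [
    (i, j, k)
    for i in params["streak_game"]
    for j in params["initial_bet"]
    for k in params["run_diff_abs"]
]

# Take only as many combinations as df_loop has rows (99 of the 275 here),
# then rebuild the frame in one call, reusing the original columns and index.
n_rows = len(df_loop)
df_loop = pd.DataFrame(my_param[:n_rows], columns=df_loop.columns, index=df_loop.index)
```

This keeps the 1..99 index from the question and avoids one Python-level assignment per row.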

CodePudding user response：

This should work.

import pandas as pd

df_loop = pd.DataFrame(columns=['streak', 'bet', 'runs'])

params = {
        'streak_game' : [3,4,5,6,7],
        'initial_bet' : [50, 100, 150, 200, 250],
        'run_diff_abs' : [x for x in range(150, 40,-10)]
         }
rows = []  # collect one [streak, bet, runs] row per parameter combination
for i in params['streak_game']:
    for j in params['initial_bet']:
        for k in params['run_diff_abs']:
            rows.append([i, j, k])

# build the dataframe once instead of growing it inside the loop
df_loop = pd.DataFrame(rows, columns=['streak', 'bet', 'runs'])
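
The nested loops above enumerate every combination of the three lists. As a possible alternative, pandas can build the same grid without explicit loops. This is only a sketch, assuming the params dictionary from the question and the column names streak, bet and runs; the name grid is hypothetical.

```python
import pandas as pd

params = {
    "streak_game": [3, 4, 5, 6, 7],
    "initial_bet": [50, 100, 150, 200, 250],
    "run_diff_abs": list(range(150, 40, -10)),
}

# Cartesian product of the three lists, with the last list varying fastest,
# converted to an ordinary dataframe with a default 0..n-1 index.
grid = pd.MultiIndex.from_product(
    [params["streak_game"], params["initial_bet"], params["run_diff_abs"]],
    names=["streak", "bet", "runs"],
).to_frame(index=False)
```

With the lists from the question this gives 5 * 5 * 11 = 275 rows, in the same order the nested loops would visit them.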