Using a list to generate dataframes in for loop

Time:09-23

I have four dataframes: df_may, df_jun, df_jul, df_aug. I can write them to individual csv files manually with 4 lines of code, but I want to do this in a for loop.

This is what I tried, which raises a SyntaxError:

months = ['may','jun','jul','aug']
for i in months:
       df_{}.format(i).to_csv('raw_master_{}'.format(i))

Also, can I extract the list 'months' somehow using existing dataframes in my notebook?

CodePudding user response:

I would use a dictionary to associate the variable with the month:

month_dfs = {'may': df_may, 'jun': df_jun, 'jul': df_jul, 'aug': df_aug}
for month, df in month_dfs.items():
    df.to_csv(f'raw_master_{month}')
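
For reference, here is a self-contained sketch of the dictionary approach that can be run as-is. The two small frames are stand-ins for the real df_may ... df_aug, and the temporary output directory, the ".csv" extension and index=False are assumptions made for the example, not something the question specifies.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in frames; in the question these would be df_may, df_jun, ...
month_dfs = {
    "may": pd.DataFrame({"value": [1, 2]}),
    "jun": pd.DataFrame({"value": [3, 4]}),
}

# Hypothetical output location, used here so the example leaves no files behind
out_dir = Path(tempfile.mkdtemp())

for month, df in month_dfs.items():
    # The ".csv" suffix and index=False are choices made for this sketch
    df.to_csv(out_dir / f"raw_master_{month}.csv", index=False)

written = sorted(p.name for p in out_dir.iterdir())
print(written)
```

Keeping the month name as the dictionary key means the file name and the data can never get out of sync, which is the main advantage over separate variables.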

If you REALLY want to look the dataframes up by variable name, you can use the locals() function, but I don't recommend it. The code is brittle and hard to understand, and code analyzers/linters won't be able to catch errors for you.

for month in months:
    locals()[f'df_{month}'].to_csv(f'raw_master_{month}')
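
To illustrate one way the name lookup can break, here is a small sketch. It assumes the lookup gets moved into a helper function (the name lookup is hypothetical), and it uses a plain string as a stand-in for a dataframe so it runs without pandas.

```python
df_may = "stand-in for a DataFrame"  # defined at the top level, like in a notebook cell

def lookup(month):
    # Inside a function, locals() only contains the function's own
    # variables (here just `month`), so top-level names are not found.
    return locals()[f"df_{month}"]

try:
    lookup("may")
    found = True
except KeyError:
    found = False

print(found)  # False: the lookup raised KeyError
```

The same line works at the top level of a notebook cell and fails as soon as it is wrapped in a function, and no linter will warn about it.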

CodePudding user response:

Also, can I extract the list 'months' somehow using existing dataframes in my notebook?

You will need to specify what notebook you are referring to here, or what the data contains, otherwise it's hard to answer. If you can't get the month string directly from your DataFrames you can use zip().

for month, data in zip(['may', 'jun', 'jul', 'aug'], [df_may, df_jun, df_jul, df_aug]):
    data.to_csv(f'raw_master_{month}')
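
If the goal is to build the 'months' list from dataframes that already exist as variables, one possible sketch is to scan a namespace for DataFrame objects whose names start with df_. The helper name find_month_frames and the df_ prefix convention are assumptions for this example. In a notebook you would presumably pass globals() as the namespace; a plain dictionary is used here so the example is self-contained.

```python
import pandas as pd

def find_month_frames(namespace, prefix="df_"):
    """Return {suffix: DataFrame} for each DataFrame whose name starts with prefix."""
    return {
        name[len(prefix):]: obj
        for name, obj in namespace.items()
        if name.startswith(prefix) and isinstance(obj, pd.DataFrame)
    }

# Stand-in namespace; in a notebook you would pass globals() instead.
namespace = {
    "df_may": pd.DataFrame({"x": [1]}),
    "df_jun": pd.DataFrame({"x": [2]}),
    "df_list": [1, 2, 3],               # skipped: not a DataFrame
    "other": pd.DataFrame({"x": [3]}),  # skipped: wrong prefix
}

month_dfs = find_month_frames(namespace)
months = list(month_dfs)
print(months)
```

Note that this picks up every dataframe named df_something, so a variable such as df_raw would also end up in the list. It has the same weaknesses as any lookup by variable name.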

CodePudding user response:

You cannot dynamically build a variable name the way you were trying to with df_{}: str.format works on strings, not on identifiers. You can put all the dataframes in a list and iterate over it, zipping it together with the list containing the months.

months = ['may', 'jun', 'jul', 'aug']
df_list = [df_may, df_jun, df_jul, df_aug]
for df, month in zip(df_list, months):
    df.to_csv(f'raw_master_{month}')

If you want to use the variables in a dynamic way, you could use globals()[f"df_{month}"], locals()[f"df_{month}"], or eval(f"df_{month}") in your code in place of df_{}. Have fun coding!
