Within a for loop how to append index value to the end of the dataframe name

Time:03-25

I am writing a 'for' loop in Python (with pandas dataframes) and would like to append the loop index to the end of each dataframe's name to tell them apart. How can I do it?

For example, I have a dataframe df whose column value holds the values 1 to 5, and I would like to split it into 5 pieces, one for each value: 1, 2, 3, 4 and 5.

I've tried the following, which raises a syntax error. How can I change it? Thanks.

for i in range(1, 5):
   df_f'{i}' = df.loc[df['value'] == i]

Note: I'd like the resulting dataframes to be named df_1, df_2, df_3, df_4, df_5.

CodePudding user response:

As @hbgoddard pointed out, it's bad practice to generate variable names dynamically (at least in production code). However, if you really want to do it, edit globals() like so:

for i in range(1, 6):  # the stop value is exclusive, so this covers 1 through 5
   globals()[f'df_{i}'] = df.loc[df['value'] == i]
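
To see the effect end to end, here is a minimal sketch run at module level. The sample data and the label column are made up for illustration (the question only says that value holds 1 to 5), and the loop uses range(1, 6) so that 5 is included:

```python
import pandas as pd

# Hypothetical sample data; only the 'value' column comes from the question.
df = pd.DataFrame({"value": [1, 2, 2, 3, 4, 5, 5], "label": list("abcdefg")})

# range(1, 6) yields 1, 2, 3, 4, 5 because the stop value is exclusive.
for i in range(1, 6):
    # Writing into globals() creates module-level names df_1 ... df_5.
    globals()[f"df_{i}"] = df.loc[df["value"] == i]

print(df_2)  # the rows where value == 2
```

Linters and IDEs cannot see names created this way, which is one practical reason to avoid it outside of quick experiments.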

CodePudding user response:

It is not recommended to generate variable names at runtime; use a list or dictionary instead.

df_parts = {i: df.loc[df['value'] == i] for i in range(1, 6)}

A more concise version that handles every unique value in the value column instead of just 1 through 5:

df_parts = dict(list(df.groupby('value')))

You can then access each part as df_parts[1], df_parts[2], etc.
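
As a rough illustration of working with the dictionary, the sketch below builds it with groupby and then looks parts up by key. The sample data and the label column are assumptions made for the example, not part of the question:

```python
import pandas as pd

# Hypothetical sample data; only the 'value' column comes from the question.
df = pd.DataFrame({"value": [1, 2, 2, 3, 4, 5, 5], "label": list("abcdefg")})

# Iterating a groupby on a single column yields (key, sub-dataframe) pairs,
# which dict() turns into a mapping from each value to its rows.
df_parts = dict(list(df.groupby("value")))

# Loop over every part without knowing the keys in advance.
for key, part in df_parts.items():
    print(f"value={key}: {len(part)} row(s)")

# .get() returns None instead of raising KeyError for a value that never occurs.
missing = df_parts.get(6)
```

Because the keys come from the data, this keeps working if the column later contains values outside 1 to 5.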
