Python calling a list by name

Time:09-13

Consider the case where I have three dataframes called df1, df2 and df3.

df1 has 5 columns: x1, x2, x3, age, height. df2 has 3 columns: x1, x2, weight. df3 has 4 columns: x1, x2, x3, bmi.

For each of these dataframes, I want to create a list of its demographic variables, e.g.

df1_demographics=['age', 'height']
df2_demographics=['weight']
df3_demographics=['bmi']

I want to be able to call the list in a loop in the following way:

for dataset in df1, df2, df3:
    print(dataset_demographics)

My actual loop is very long and I need to loop through the dataframes. That's why I specifically want a way of calling the lists within a for loop looping through the dataframes.

The desired output of this loop would be

['age', 'height']
['weight']
['bmi']

CodePudding user response:

I'm not entirely sure what your question is asking. Are you looking to associate a subset of columns with each of your dataframes like so?

demographic_cols = [
    ['age', 'height'],
    ['weight'],
    ['bmi']
]
dataframes = [df1, df2, df3]

# Use a separate loop variable (cols) so the demographic_cols list is not overwritten.
for dataset, cols in zip(dataframes, demographic_cols):
    print(dataset, cols)
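
If the goal is to keep each list tied to its dataframe by name, a dictionary is a common alternative to two parallel lists. This is only a minimal sketch: the stand-in data and the `datasets` name are assumptions made up for illustration, not something from the question.

```python
import pandas as pd

# Stand-in dataframes with the column layout from the question (values are made up).
df1 = pd.DataFrame({"x1": [1], "x2": [2], "x3": [3], "age": [30], "height": [170]})
df2 = pd.DataFrame({"x1": [1], "x2": [2], "weight": [65]})
df3 = pd.DataFrame({"x1": [1], "x2": [2], "x3": [3], "bmi": [22.5]})

# Hypothetical mapping: name -> (dataframe, its demographic columns).
datasets = {
    "df1": (df1, ["age", "height"]),
    "df2": (df2, ["weight"]),
    "df3": (df3, ["bmi"]),
}

# Each iteration has the dataframe and its column list available together.
for name, (df, cols) in datasets.items():
    print(cols)
```

Because the dataframe and its list travel together, a long loop body never has to look anything up by variable name.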

CodePudding user response:

Try renaming your variables. If you do something similar to this, you'll get your desired output:

df1=['age', 'height']
df2=['weight']
df3=['bmi']

for dataset in df1, df2, df3:
    print(dataset)

Output:

['age', 'height']
['weight']
['bmi']

CodePudding user response:

I think this can be done using zip:

df1_demographics=['age', 'height']
df2_demographics=['weight']
df3_demographics=['bmi']
demographics = [df1_demographics, df2_demographics, df3_demographics]
dfs = [df1, df2, df3]

# Use a separate loop variable (cols) so the demographics list is not overwritten.
for df, cols in zip(dfs, demographics):
    # do whatever you want to do
    # for example
    for val in cols:
        print(df[val])
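
Since indexing a dataframe with a list of column names returns all of those columns at once, the inner loop can probably be replaced by a single selection. A minimal sketch, assuming small made-up dataframes; the `subsets` list is a hypothetical helper used only to keep the results:

```python
import pandas as pd

# Made-up stand-in data; only two dataframes are shown to keep the sketch short.
df1 = pd.DataFrame({"x1": [1, 2], "age": [30, 41], "height": [170, 182]})
df2 = pd.DataFrame({"x1": [1, 2], "weight": [65, 80]})

dfs = [df1, df2]
demographics = [["age", "height"], ["weight"]]

subsets = []
for df, cols in zip(dfs, demographics):
    subset = df[cols]  # DataFrame containing only the listed columns
    subsets.append(subset)
    print(subset)
```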

CodePudding user response:

I believe the question needs more clarity, but here's what I understand: you wish to create a list of column names that you can extract from any of the three datasets you have created. Here's how you can do that:

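import numpy as np
import pandas as pd

# Every demographic column that may appear in any of the dataframes
demographics = ['age', 'height', 'weight', 'bmi']

# Example dataframes filled with random integers
df1 = pd.DataFrame(np.random.randint(0, 100, size=(5, 5)), columns=['x1', 'x2', 'x3', 'age', 'height'])
df2 = pd.DataFrame(np.random.randint(0, 100, size=(5, 3)), columns=['x1', 'x2', 'weight'])
df3 = pd.DataFrame(np.random.randint(0, 100, size=(5, 4)), columns=['x1', 'x2', 'x3', 'bmi'])

for dataset in df1, df2, df3:
    # Keep only the columns whose names appear in the demographics list
    print(dataset.loc[:, dataset.columns.isin(demographics)])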

The result contains only the demographic columns present in each dataframe; the values are random, so they differ from run to run.
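
If what you need is the list of names rather than the data, the same master list can be filtered against each dataframe's columns. This is a rough sketch that assumes empty stand-in dataframes (only the column names matter); `found` is a hypothetical helper list:

```python
import pandas as pd

# Master list of demographic column names, as in the answer above.
demographics = ["age", "height", "weight", "bmi"]

# Empty stand-in dataframes; only the column names matter here.
df1 = pd.DataFrame(columns=["x1", "x2", "x3", "age", "height"])
df2 = pd.DataFrame(columns=["x1", "x2", "weight"])
df3 = pd.DataFrame(columns=["x1", "x2", "x3", "bmi"])

found = []
for dataset in (df1, df2, df3):
    # Keep the dataframe's own column order, filtered to demographic names.
    cols = [c for c in dataset.columns if c in demographics]
    found.append(cols)
    print(cols)
```

This prints the three lists from the question without defining a separate list per dataframe.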

Hope this helps a bit.
