Dynamically name items in a list

Time:09-03

I have this df called feature_df:

      Coupon    CprTarget   bondsec_code
0      3.0      9.908900    2
1      3.5      9.172600    1
2      4.0      9.993500    1
3      3.5      8.985600    4
4      4.0      12.190200   3
... ... ... ...
20707   1.5    5.559933    2
20708   4.0    12.866399   3
20709   5.0    17.982506   1
20710   3.5    12.098302   3
20711   2.5    11.390324   2

I'd like to generate sets of new dataframes from this one, grouped by Coupon and bondsec_code. For example, the dataframe for Coupon 3.0 and bondsec_code 1 might look like:

      Coupon    CprTarget   bondsec_code
0      3.0      8.408900    1
1      3.0      8.172600    1
2      3.0      8.993500    1
3      3.0      8.985600    1
4      3.0      11.190200   1

I know that I could achieve this manually with something like: coup_3_bondsec_1 = feature_df[(feature_df['Coupon'] == 3.0) & (feature_df['bondsec_code'] == 1)], but I want something dynamic, via a for loop, that also lets me name each dataframe appropriately. I'm just not sure how to create this.

For your reference when I run:

print(sorted(feature_df['Coupon'].unique()))
print(sorted(feature_df['bondsec_code'].unique()))

The output is:

[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
[0, 1, 2, 3, 4]

So those are all the combinations of Coupon and bondsec_code that I could have.

I know it would look something like this:

dfs = []
for coupon, code in set(feature_df['Coupon']), set(feature_df['bondsec_code']):
    cluster = feature_df[(feature_df['Coupon'] == coupon) & (feature_df['bondsec_code'] == code)]
    dfs.append(cluster)

But this is throwing an error: ValueError: too many values to unpack (expected 2)

EDIT I think I figured something out:

for coupon in set(feature_df['Coupon']):
    for code in set(feature_df['bondsec_code']):
        cluster = feature_df[(feature_df['Coupon'] == coupon) & (feature_df['bondsec_code'] == code)]
        dfs.append(cluster)

This works, but now I need a way to separate out each df in that list and name it dynamically, maybe with another for loop. Ideal names could just be bondsec_code_Coupon, so something like '1_3'.

Answer:

The common way to "name variables dynamically" is to use a dict. Instead of trying to do the following dynamically:

var_name_for_abc = "abc"
var_name_for_xyz = "xyz"

A dict:

my_vars = {f'var_name_for_{val}': val for val in ['abc', 'xyz']}

# alternatively:
my_vars = {}
for val in ['abc', 'xyz']:
    my_vars[f'var_name_for_{val}'] = val

To convince you of this: in order to actually accomplish the first example, you'd use globals(), which returns the module's global variables... as a dict, and you'd update it in the same manner anyway.
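Applied to the feature_df from the question, the same idea might look like the sketch below. It assumes pandas is available and uses a small stand-in frame with made-up rows rather than the asker's real data. It also swaps the nested for loops for groupby, which is one possible approach, not the only one. The keys follow the bondsec_code_Coupon naming the question asks for; the `:g` format is an assumption made here so that 3.0 is rendered as '3'.

```python
import pandas as pd

# Small stand-in for feature_df (hypothetical rows, not the asker's data)
feature_df = pd.DataFrame({
    "Coupon": [3.0, 3.5, 4.0, 3.0, 4.0, 3.0],
    "CprTarget": [9.9089, 9.1726, 9.9935, 8.9856, 12.1902, 8.4089],
    "bondsec_code": [2, 1, 1, 1, 3, 1],
})

# groupby yields one sub-frame per (Coupon, bondsec_code) pair present in the data.
# Each sub-frame is stored under a key such as '1_3' (bondsec_code 1, Coupon 3.0).
dfs = {
    f"{code}_{coupon:g}": group
    for (coupon, code), group in feature_df.groupby(["Coupon", "bondsec_code"])
}

print(sorted(dfs))   # names of the frames that were created
print(dfs["1_3"])    # look a frame up by its name
```

Unlike the nested loops, groupby only produces entries for combinations that actually occur, so the dict contains no empty dataframes.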
