Pandas- Fill a dictionary with dataframes depending on a switch

Time:11-24

Background: I have a few dataframes that may be turned on or off with switches. I want to fill a dictionary with only the turned-on dataframes, and then loop over that dictionary.

Issue: I don't know how to dynamically build my dictionary to only include the dataframes when their switches are turned on.

What I've Tried:

import pandas as pd

sw_a = True
sw_b = False
sw_c = True

a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                   'Cost':[1.1,1.2,1.3,1.4,1.5],
                    'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']}) if sw_a == True else []
b = pd.DataFrame({'IDs':[1,2],
                   'Cost':[1.1,1.2],
                    'Names':['APPLE1','Blue1']}) if sw_b == True else []
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                    'Names':['APPLE2']}) if sw_c == True else []
total = {"first":a,"second":b,"third":c}

for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')

The above does not work because the dictionary always includes an entry for every dataframe: if a switch is off, its value is an empty list instead of the entry being excluded entirely.
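To see concretely why the loop breaks, here is a minimal sketch that reproduces the problem with just the switched-off dataframe. The variable error_message is only an illustrative name, not part of the original code. The placeholder [] still gets a key in the dictionary, and indexing it with 'Cost' raises a TypeError.

```python
import pandas as pd

sw_b = False  # this switch is off

# Same conditional assignment as in the question: an empty list stands in for the dataframe.
b = pd.DataFrame({'IDs': [1, 2],
                  'Cost': [1.1, 1.2],
                  'Names': ['APPLE1', 'Blue1']}) if sw_b else []

total = {"second": b}  # the key is present even though the switch is off

error_message = None  # illustrative name, used only to capture the failure
try:
    sum(total["second"]['Cost'])  # indexing a list with a string key
except TypeError as exc:
    error_message = str(exc)

print("second" in total)  # the entry was not excluded
print(error_message)      # the TypeError message from the failed lookup
```

So the fix has to happen when the dictionary is built, not when the dataframes are assigned.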

CodePudding user response:

Consider something like this.

import pandas as pd

sw_a = True
sw_b = False
sw_c = True

a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                   'Cost':[1.1,1.2,1.3,1.4,1.5],
                    'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
b = pd.DataFrame({'IDs':[1,2],
                   'Cost':[1.1,1.2],
                    'Names':['APPLE1','Blue1']})
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                    'Names':['APPLE2']})

# start empty and add a dataframe only when its switch is on
total = {}
if sw_a:
    total['sw_a'] = a
if sw_b:
    total['sw_b'] = b
if sw_c:
    total['sw_c'] = c
print(total)

for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')

The number of fruits for sw_a is 5 and the cost is 6.5
The number of fruits for sw_c is 1 and the cost is 1.5
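If the number of switches grows, the chain of if statements can get long. One possible variation, offered as a sketch rather than as part of the answer above, keeps each switch next to its dataframe in a single dictionary and filters it with a comprehension. The names candidates and enabled are assumptions made for this example.

```python
import pandas as pd

a = pd.DataFrame({'IDs': [1234, 5346, 1234, 8793, 8793],
                  'Cost': [1.1, 1.2, 1.3, 1.4, 1.5],
                  'Names': ['APPLE', 'Orange', 'STRAWBERRY', 'Grape', 'Blue']})
b = pd.DataFrame({'IDs': [1, 2],
                  'Cost': [1.1, 1.2],
                  'Names': ['APPLE1', 'Blue1']})
c = pd.DataFrame({'IDs': [12],
                  'Cost': [1.5],
                  'Names': ['APPLE2']})

# Hypothetical structure: name -> (switch, dataframe), so the two cannot drift apart.
candidates = {
    'first': (True, a),
    'second': (False, b),
    'third': (True, c),
}

# Keep only the entries whose switch is on.
total = {name: df for name, (enabled, df) in candidates.items() if enabled}

for name, df in total.items():
    temp_cost = sum(df['Cost'])
    print(f'The number of fruits for {name} is {len(df)} and the cost is {temp_cost}')
```

Iterating with total.items() also avoids looking each dataframe up again inside the loop.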

CodePudding user response:

My set-up is similar to yours, but I don't bother with the switches on each dataframe assignment:

import pandas as pd

sw_a = True
sw_b = False
sw_c = True

a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                   'Cost':[1.1,1.2,1.3,1.4,1.5],
                    'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
b = pd.DataFrame({'IDs':[1,2],
                   'Cost':[1.1,1.2],
                    'Names':['APPLE1','Blue1']})
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                    'Names':['APPLE2']})

total = {"first":a,"second":b,"third":c} # don't worry about the switches yet.

Only now do we filter:

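list_switches = [sw_a, sw_b, sw_c] # the switches! finally!
# pair each switch with a key of total (dicts keep insertion order) and keep the enabled ones
total_filtered = {name: total[name] for switch, name in zip(list_switches, total) if switch}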

And carry on as you have done.

for df in total_filtered:
    temp_cost = sum(total_filtered[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total_filtered[df])} and the cost is {temp_cost}')

Edit: You can be slightly fancier with zip. For example, if you're constructing the lists of dataframes, dataframe names, and switches dynamically, and can be sure that they will always be the same length, you could do something like:

# pretend these three lists are coming from somewhere else and can have variable length, rather than being hard-coded.
list_dfs = [a,b,c]
list_switches = [sw_a, sw_b, sw_c]
list_names = ["first", "second", "third"]

# use a zip object over the three lists.
zipped = zip(list_dfs, list_switches, list_names)
total = {tup[2] : tup[0] for tup in zipped if tup[1]}

for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')
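The zip approach quietly depends on the three lists having the same length, because zip stops at the shortest input without complaining. If that cannot be guaranteed, a small helper that checks the lengths first may be safer. This is only a sketch: select_enabled is a hypothetical name, and the sample dataframes are shortened for brevity.

```python
import pandas as pd


def select_enabled(names, frames, switches):
    """Return {name: frame} for every entry whose switch is truthy.

    Hypothetical helper. Raises ValueError when the inputs differ in length,
    since zip() would otherwise silently drop the extra items.
    """
    if not (len(names) == len(frames) == len(switches)):
        raise ValueError("names, frames and switches must have the same length")
    return {name: frame
            for name, frame, on in zip(names, frames, switches)
            if on}


a = pd.DataFrame({'IDs': [1234, 5346], 'Cost': [1.1, 1.2], 'Names': ['APPLE', 'Orange']})
b = pd.DataFrame({'IDs': [1, 2], 'Cost': [1.1, 1.2], 'Names': ['APPLE1', 'Blue1']})
c = pd.DataFrame({'IDs': [12], 'Cost': [1.5], 'Names': ['APPLE2']})

total = select_enabled(["first", "second", "third"], [a, b, c], [True, False, True])

for name, df in total.items():
    print(f'The number of fruits for {name} is {len(df)}')
```

With the switches above, only "first" and "third" end up in total.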