I have written a Python function that takes a data frame as one of its arguments. Below is a simplified version of the function:
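import itertools
def cat_update(df_to_update, df_source, cat_lst, con_lst):
    try:
        for cat, con in itertools.product(cat_lst, con_lst):
            df_to_update.at[cat, con] = df_source.at[cat, con]
    except KeyError as err:
        # The simplified version omits its except clause; this handler is an assumption.
        print(f"Missing label: {err}")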
Below is how I am calling this function:
cat_update(df_templete1, raw_source, cat_lst, con_lst)
Now I need to scale my code so that it can work with multiple source data frames (raw_source).
How do I specify a variable here so that, instead of hard-coding the actual data frame, I can change it as required?
I tried assigning the value of the variable as follows:
raw_source = 'df_source_1'
But in this case it is passed as a string and not as a data frame, so the function cannot use it as expected. In short, I need to change it from str to pandas.core.frame.DataFrame.
More information: I call the above function inside a for loop:
for n in range(len(df_config)):
    cat_lst = df_config.at[n, 'category'].split(",")
    con_lst = df_config.at[n, 'country'].split(",")
    raw_source = df_config.at[n, 'Raw source']
    energy_source = df_config.at[n, 'Energy source']
Hence the source data frame is picked automatically from user input, which is saved in df_config.
CodePudding user response:
Create a dictionary like this: {"data_frame_name" : data_frame}, so that you can access each data frame by its name. Assume we have some data, data_src_1, like below:
import pandas as pd
data_src_1 = [['Alex',10],['Bob',12],['Clarke',13]]
df_source_1 = pd.DataFrame(data_src_1)
raw_sources = {"df_source_1" : df_source_1} # You can have other dataframes here
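To illustrate, here is a minimal sketch of the lookup. The second frame, df_source_2, and its contents are made up for the example. Indexing the dictionary with the name string gives back the DataFrame object itself, not a string:

```python
import pandas as pd

# Hypothetical sample frames; only df_source_1 comes from the snippet above.
df_source_1 = pd.DataFrame([["Alex", 10], ["Bob", 12], ["Clarke", 13]])
df_source_2 = pd.DataFrame([["Dana", 20], ["Eve", 22]])

raw_sources = {"df_source_1": df_source_1, "df_source_2": df_source_2}

name = "df_source_2"          # e.g. a string read from a config table
selected = raw_sources[name]  # dictionary lookup by name

print(type(name).__name__)      # str
print(type(selected).__name__)  # DataFrame
print(selected is df_source_2)  # True
```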
Pass the name of the data frame you want as df_source to the cat_update method, and edit the method like this:
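raw_sources = {"df_source_1": df_source_1}  # ... plus any other data frames
def cat_update(df_to_update, df_source, cat_lst, con_lst):
    # df_source is now a name (str) used as a key into raw_sources
    try:
        for cat, con in itertools.product(cat_lst, con_lst):
            df_to_update.at[cat, con] = raw_sources[df_source].at[cat, con]
    except KeyError as err:
        # The original omits its except clause; this handler is an assumption.
        print(f"Missing label: {err}")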
Alternatively, you could pass the data frame itself (such as df_source_1) to the method. The advantage of the snippet above is that it keeps all the data frames together in one dictionary (raw_sources).
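Putting the pieces together, the following is a rough sketch of how the config-driven loop from the question could resolve each name through the dictionary and hand the actual DataFrame to cat_update. The sample frames, the category and country labels, the df_template name, and the explicit check for unknown names are all assumptions made for illustration. The function body is the question's version without its try block.

```python
import itertools

import pandas as pd


def cat_update(df_to_update, df_source, cat_lst, con_lst):
    """Copy each (category, country) cell from df_source into df_to_update."""
    for cat, con in itertools.product(cat_lst, con_lst):
        df_to_update.at[cat, con] = df_source.at[cat, con]


# Hypothetical source frames, indexed by category with one column per country.
df_source_1 = pd.DataFrame({"US": [1.0, 2.0], "DE": [3.0, 4.0]}, index=["coal", "gas"])
df_source_2 = pd.DataFrame({"US": [10.0, 20.0], "DE": [30.0, 40.0]}, index=["coal", "gas"])
raw_sources = {"df_source_1": df_source_1, "df_source_2": df_source_2}

# Hypothetical config table in the shape the question describes.
df_config = pd.DataFrame({
    "category": ["coal", "gas"],
    "country": ["US,DE", "DE"],
    "Raw source": ["df_source_1", "df_source_2"],
})

# Target template, pre-filled with zeros.
df_template = pd.DataFrame(0.0, index=["coal", "gas"], columns=["US", "DE"])

for n in range(len(df_config)):
    cat_lst = df_config.at[n, "category"].split(",")
    con_lst = df_config.at[n, "country"].split(",")
    source_name = df_config.at[n, "Raw source"]  # a str taken from user input
    if source_name not in raw_sources:
        raise KeyError(f"Unknown source data frame: {source_name!r}")
    # Look up the real DataFrame by name and pass the object itself.
    cat_update(df_template, raw_sources[source_name], cat_lst, con_lst)

print(df_template)
```

Doing the lookup in the loop keeps cat_update independent of the global dictionary, so the function still works when called directly with a DataFrame.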