I have a set of functions that run on each column of a dataframe in turn, and each call generates an output for that column. I am trying to find a way to store these outputs in a newly initialized pandas dataframe as they are generated. The outputs for the different columns have different values but the same length (e.g. index 0 to 3, i.e. 4 rows).
(So, essentially, in each iteration a column is selected from the original dataframe and passed to a function, the function generates an output, and I want to keep appending these outputs as columns of a new df.) When I initialize an empty dataframe before the for loop and then add the column to it using df.assign, the code doesn't work. Can someone please help?
An example of the output for one column:
index
0 24
1 59
2 43.7
3 9.8
The main structure of the code looks like this:
def main():
    df_new = pd.Dataframe() #initializing empty dataframe
    for col in df.columns:
        col_df = df_full[[col]]
        col_df.reset_index(inplace=True)
        #calling function 1 that produces the output()
        #Lets say the output is stored in variable 'value_generated'
        df.assign(value_generated)
Expected new df (with dummy values):
index  Col1  Col2  Col3  Col4
0      24    21    20    24.8
1      59    50    61.1  60.3
2      43.7  4     48    49
CodePudding user response:
This code iterates through the columns of pre_existing_df. You would need to replace give_output() with your own functions. Right now, for each column in pre_existing_df, it adds a column with the same name to df_new and fills it with [1,2,3].
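import pandas as pd

def give_output():
    # Placeholder: replace with the function that produces one column's values
    return [1, 2, 3]

def main():
    df_new = pd.DataFrame()  # the class is DataFrame, with a capital F
    for col in pre_existing_df.columns:
        df_new[col] = give_output()  # adds the column to df_new in place
    return df_new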
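
For reference, here is a self-contained sketch of the same idea that can be run as is. The input frame df_full and the function process_column are hypothetical stand-ins for the original data and for "function 1" in the question. It also shows why the original df.assign(value_generated) call had no effect: assign expects keyword arguments of the form column_name=values and returns a new dataframe instead of modifying the one it is called on, so the result has to be rebound.

```python
import pandas as pd

# Hypothetical input data standing in for the original dataframe.
df_full = pd.DataFrame({
    "Col1": [1.0, 2.0, 3.0, 4.0],
    "Col2": [10.0, 20.0, 30.0, 40.0],
})

def process_column(series):
    # Hypothetical stand-in for "function 1": returns one value per row.
    return (series * 2).tolist()

def build_output(df):
    df_new = pd.DataFrame()
    for col in df.columns:
        value_generated = process_column(df[col])
        # Option 1: plain column assignment, which modifies df_new in place.
        df_new[col] = value_generated
        # Option 2: assign() returns a new frame, so the result must be rebound:
        # df_new = df_new.assign(**{col: value_generated})
    return df_new

result = build_output(df_full)
print(result)
```

If all outputs are available as lists of equal length, building a dict of column name to values and passing it to pd.DataFrame once after the loop is another option that avoids growing the frame column by column.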