I'm running a for loop with pandas that checks whether a DataFrame with a given name has already been created. If it has, the values should just be appended to the corresponding columns. If it has not, the df should be created and the values appended to the named columns.
dflistglobal = []
####
For loop that generates the variables a, b, and c every time it runs.
####
###
The following code runs inside the for loop. On every iteration it should generate a, b, and c, then check whether a df with a specific name has already been created. If yes, it should append the values to that "listname". If not, it should create a new one named "listname". The list name changes on every iteration, and the same name can come up more than once during the loop.
###
if listname not in dflistglobal:
    dflistglobal.append(listname)
    listname = pd.DataFrame(columns=['a', 'b', 'c'])
    listname = listname.append({'a':a, 'b':b, 'c':c}, ignore_index=True)
else:
    listname = listname.append({'a':a, 'b':b, 'c':c}, ignore_index=True)
I am getting the following error:
File "test.py", line 150, in <module>
functiontest(image, results, list)
File "test.py", line 68, in funtiontest
listname = listname.append({'a':a, 'b':b, 'c':c}, ignore_index=True)
AttributeError: 'str' object has no attribute 'append'
The initial if statement runs fine, but the else statement causes problems.
CodePudding user response:
Solved this issue by not using pandas DataFrames inside the loop. I looped through the for loop, generating a unique identifier for each listname, then appended a, b, c, listname to a list. At the end you end up with one large df that can be filtered using the groupby function.
Not sure if this will be helpful for anyone, but avoiding the creation of pandas dfs inside the loop and using a list instead was the best approach for me.
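A minimal sketch of the list-based approach described above. The sample names and values are hypothetical stand-ins for whatever the real loop produces, and the final split into one DataFrame per name is just one possible way to use groupby here:

```python
import pandas as pd

rows = []  # one plain dict per loop iteration, no DataFrame needed yet

# Hypothetical stand-in for the real loop that produces listname, a, b, c
samples = [("df1", 1, 2, 3), ("df2", 4, 5, 6), ("df1", 7, 8, 9)]
for listname, a, b, c in samples:
    rows.append({"listname": listname, "a": a, "b": b, "c": c})

# Build the single large DataFrame once, after the loop
big_df = pd.DataFrame(rows)

# Split it into one DataFrame per name with groupby
frames = {
    name: group.drop(columns="listname").reset_index(drop=True)
    for name, group in big_df.groupby("listname")
}

print(frames["df1"])
```

Appending to a Python list is cheap, whereas growing a DataFrame row by row copies the data on every iteration, so building the DataFrame once at the end tends to be faster as well as simpler.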
CodePudding user response:
That error tells you that listname is a string, not a DataFrame (and a string has no append method).
You may want to check whether somewhere in your code you are adding a string to your list dflistglobal.
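The failure can be reproduced without pandas. The sketch below assumes listname holds a plain name such as "df1" (a hypothetical value) and uses a list as a stand-in for the DataFrame. It shows that rebinding listname in the if branch does not survive to the next iteration, so the else branch sees the string again:

```python
seen_names = []  # plays the role of dflistglobal
errors = []

for listname in ["df1", "df1"]:  # hypothetical names; the second one repeats
    if listname not in seen_names:
        seen_names.append(listname)
        # Rebinds the local name only; the new object is not stored anywhere
        listname = [("a", "b", "c")]  # stand-in for a DataFrame
    else:
        # On the repeated name, listname is the plain string again
        try:
            listname.append(("a", "b", "c"))
        except AttributeError as exc:
            errors.append(str(exc))

print(errors)
```

The first iteration takes the if branch and works, the second one takes the else branch and hits the AttributeError, which matches the behaviour described in the question.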
EDIT: Possible solution
I'm not sure how you are naming your DataFrames, and I don't see how you can access them afterwards.
Instead of using a list, you can store your DataFrames inside a dictionary, dict = {"name": df}. This will let you easily access the DataFrames by name.
import pandas as pd
import random

# Note: DataFrame.append was removed in pandas 2.0; this code needs pandas < 2.0
df_dict = {}

# For loop
for _ in range(10):
    # Logic to get your variables (example)
    a = random.randint(1, 10)
    b = random.randint(1, 10)
    c = random.randint(1, 10)

    # Logic to get your DataFrame name (example)
    df_name = f"dataframe{random.randint(1,10)}"

    if df_name not in df_dict:
        # DataFrame with same name does not exist
        new_df = pd.DataFrame(columns=['a', 'b', 'c'])
        new_df = new_df.append({'a':a, 'b':b, 'c':c}, ignore_index=True)
        df_dict[df_name] = new_df
    else:
        # DataFrame with same name exists
        updated_df = df_dict[df_name].append({'a':a, 'b':b, 'c':c}, ignore_index=True)
        df_dict[df_name] = updated_df
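The dictionary approach above relies on DataFrame.append, which was deprecated in pandas 1.4 and removed in pandas 2.0. A possible variant that avoids it, sketched under the same example assumptions (random values, randomly chosen names), collects plain rows per name first and builds each DataFrame once at the end. The name rows_by_name is introduced here only for illustration:

```python
import random
from collections import defaultdict

import pandas as pd

random.seed(0)  # only to make the example repeatable

# Maps each DataFrame name to the list of rows collected for it
rows_by_name = defaultdict(list)

for _ in range(10):
    # Logic to get your variables (example)
    a = random.randint(1, 10)
    b = random.randint(1, 10)
    c = random.randint(1, 10)

    # Logic to get your DataFrame name (example)
    df_name = f"dataframe{random.randint(1, 3)}"

    # No existence check needed: defaultdict creates the list on first use
    rows_by_name[df_name].append({"a": a, "b": b, "c": c})

# Build one DataFrame per name after the loop
df_dict = {
    name: pd.DataFrame(rows, columns=["a", "b", "c"])
    for name, rows in rows_by_name.items()
}

for name, df in df_dict.items():
    print(name, len(df))
```

Each DataFrame is still reachable by name through df_dict, as in the original answer.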
Also, for more info, you may want to visit this question
I hope it was clear and it helps.