Dropping rows of dataframe in a for-loop in Python

Time:12-04

I have multiple dataframes with multiple columns, like this:

DF = 
    A    B    C   metadata_Colunm
r1  6    3    9   r1 
r2  2    1    1   r2
r3  5    7    2   r3

How can I use a for-loop to iterate over each column to make a new dataframe and then remove rows where values are below 5 for each new dataframe? The result should look like this:

DF_A=
      A   metadata_Colunm
      6   r1
      5   r3

DF_B=
      B   metadata_Colunm
      7   r3

DF_C=
      C   metadata_Colunm
      9   r1

What I have done so far is to make a list of the columns I will use (all except the metadata column) and then go through the columns, building a new dataframe for each. Since I also need to preserve the metadata, I add the metadata column to each new dataframe:

DF = DF.drop("metadata_Colunm")
ColList = list(DF)
for item in ColList:
    locals()[f"DF_{str(item)}"] = DF[[item, "metadata_Colunm"]]
    locals()[f"DF_{str(item)}"] = locals()[f"DF_{str(item)}"].drop(locals()[f"DF_{str(item)}"][locals()[f"DF_{str(item)}"].item > 0.5].index, inplace=True)
     

But using this I get "AttributeError: 'DataFrame' object has no attribute 'item'".

Any suggestions for making this work, or any other solutions, would be greatly appreciated!

Thanks in advance!

CodePudding user response:

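You can wrap the filtering in a function and apply it to each dataframe:

def filter_rows(df, threshold=5):
    # Compare only the numeric columns; the metadata column holds strings
    numeric = df.select_dtypes(include="number")
    # Keep rows where every numeric column is at least the threshold
    return df[(numeric >= threshold).all(axis=1)]

Then apply the filter to all your dataframes:

dfs = [df1, df2, df3]  # ...and any others
filtered = [filter_rows(df) for df in dfs]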
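
To see what a whole-dataframe filter of this kind does with the sample data from the question, here is a minimal sketch. The function name keep_rows_at_or_above is made up for illustration, and the sketch assumes that only the numeric columns should be compared. Note that it keeps a row only if every numeric column meets the threshold, which is not the same as the per-column dataframes the question asks for.

```python
import pandas as pd


def keep_rows_at_or_above(df, threshold=5):
    """Keep rows where every numeric column is >= threshold (hypothetical helper)."""
    numeric = df.select_dtypes(include="number")
    mask = (numeric >= threshold).all(axis=1)
    return df[mask]


# Sample data from the question
df = pd.DataFrame(
    {
        "A": [6, 2, 5],
        "B": [3, 1, 7],
        "C": [9, 1, 2],
        "metadata_Colunm": ["r1", "r2", "r3"],
    },
    index=["r1", "r2", "r3"],
)

# With the threshold of 5 every row has at least one value below it
none_left = keep_rows_at_or_above(df)

# With a lower threshold of 2 only row r2 is dropped
some_left = keep_rows_at_or_above(df, threshold=2)

print(none_left)
print(some_left)
```

With the question's data and a threshold of 5, no row survives, because each row has at least one value below 5.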

CodePudding user response:

dfs = {}
for col in df.columns[:-1]:
    df_new = df[[col, 'metadata_Colunm']]
    dfs[col] = df_new[df_new[col] >= 5]
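
For reference, here is a self-contained sketch of the dictionary approach run on the sample data from the question. The names threshold and value_cols are illustrative, and the value columns are selected by name rather than by position, which is an assumption about what the asker wants. It also shows why the original attempt raised the AttributeError: attribute access such as DF.item looks for a column literally named "item", while bracket indexing with DF[item] uses the value of the loop variable.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {
        "A": [6, 2, 5],
        "B": [3, 1, 7],
        "C": [9, 1, 2],
        "metadata_Colunm": ["r1", "r2", "r3"],
    },
    index=["r1", "r2", "r3"],
)

threshold = 5  # rows with values below this are removed

# Every column except the metadata column
value_cols = [c for c in df.columns if c != "metadata_Colunm"]

dfs = {}
for col in value_cols:
    subset = df[[col, "metadata_Colunm"]]
    # subset[col] uses the variable's value; subset.col would look for
    # a column literally named "col" and raise AttributeError
    dfs[col] = subset[subset[col] >= threshold]

for name, frame in dfs.items():
    print(f"DF_{name}")
    print(frame)
```

Storing the results in a dictionary keyed by column name avoids writing to locals(), so each result is available as dfs["A"], dfs["B"] and dfs["C"].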