I have 238 dataframes that look like this:
Index | before | after |
---|---|---|
a,1 | 10 | 10 |
b,2 | 10 | 100 |
c,3 | 100 | 100 |
d,4 | 1000 | 100 |
I would like a for loop that drops all of the rows where the before and after values are the same, leaving only the rows where they differ. Example:
Index | before | after |
---|---|---|
b,2 | 10 | 100 |
d,4 | 1000 | 100 |
Right now, I just have 238 of these: onlydiffs_dfi = dfi[dfi['before'] != dfi['after']]
This is obviously not great. It could be done with a for loop, but I can't figure out how to write it. Please help!
CodePudding user response:
Make a list containing the dataframes and iterate:
df_list = [df0, df1]  # ...and so on for all of your dfs
for df in df_list:
    # keep only the rows where the two columns differ
    new_df = df[df['before'] != df['after']]
Then you can append it to a new list, or do whatever else you want with it. If all your dfs are in a dictionary, you can iterate over its items in the same way:
df_dict = {key0: df0, key1: df1}  # ...and so on for all of your dfs
for key, df in df_dict.items():
    new_df = df[df['before'] != df['after']]
Or, less Pythonic, iterate over the keys and index into the dictionary:
for key in df_dict.keys():
    df = df_dict[key]
    new_df = df[df['before'] != df['after']]
You can even convert your dictionary's values to a list and use the first method:
df_list = list(df_dict.values())
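Note that the loops above overwrite new_df on every iteration, so the results need to be stored somewhere. Here is a minimal sketch of one way to do that with comprehensions. The sample frames, the helper name only_diffs, and the dictionary keys "first" and "second" are made up for illustration; only the before and after column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; df0 mirrors the table in the question.
df0 = pd.DataFrame(
    {"before": [10, 10, 100, 1000], "after": [10, 100, 100, 100]},
    index=["a,1", "b,2", "c,3", "d,4"],
)
df1 = pd.DataFrame({"before": [1, 2], "after": [1, 3]}, index=["x,1", "y,2"])


def only_diffs(df):
    """Return only the rows where 'before' and 'after' differ."""
    return df[df["before"] != df["after"]]


# List version: build a new list of filtered frames.
df_list = [df0, df1]
filtered_list = [only_diffs(df) for df in df_list]

# Dictionary version: keep the same keys, filter each value.
df_dict = {"first": df0, "second": df1}
filtered_dict = {key: only_diffs(df) for key, df in df_dict.items()}

print(filtered_list[0])
```

The dictionary version is probably the more convenient of the two with 238 frames, since each filtered result can still be looked up by the same key as its original.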