I am trying to set part of two dataframes (static_df_1 and static_df_2) of the same size (1,000,000 rows and 8 columns) equal to each other based on 4 conditions, but I am unable to make them equal. Both dataframes share the columns i, j, and sales. I want to make the sales values equal only in the rows where 25 < i < 36 and 25 < j < 36. After running the code below, the two dataframes are still different:
old_sales = static_df_1.loc[(static_df_1['i'] > 25 ) & (static_df_1['i'] < 36) & (static_df_1['j'] > 25 ) & (static_df_1['j'] < 36 )]['sales']
static_df_2.loc[(static_df_2['i'] > 25 ) & (static_df_2['i'] < 36) & (static_df_2['j'] > 25 ) & (static_df_2['j'] < 36 )]['sales'] = old_sales
CodePudding user response:
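The chained form .loc[...]['sales'] = old_sales assigns into a temporary copy returned by .loc, so static_df_2 itself is never modified. Typically you would index with a single call that takes both the row and the column indexer: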
df.loc[row_indexer,column_indexer]
Maybe separate things out to make it easier to evaluate.
row_indexer = (static_df_1['i'] > 25 ) & (static_df_1['i'] < 36) & (static_df_1['j'] > 25 ) & (static_df_1['j'] < 36 )
old = static_df_1.loc[row_indexer,'sales']
static_df_2.loc[row_indexer,'sales'] = old
I don't have Pandas installed here so I cannot test.
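Here is a minimal sketch of the difference on toy data. The four-row frames are made up for illustration, and it assumes both dataframes share the same index and the same i and j values, as the question implies. It first repeats the chained assignment from the question, then does the single .loc assignment.

```python
import warnings

import pandas as pd

# Hypothetical toy data standing in for the real static_df_1 / static_df_2.
static_df_1 = pd.DataFrame(
    {"i": [20, 30, 30, 40], "j": [30, 30, 20, 30], "sales": [1.0, 2.0, 3.0, 4.0]}
)
# Same i and j, but different sales values.
static_df_2 = static_df_1.assign(sales=0.0)

# One boolean row mask for all four conditions.
row_indexer = (
    (static_df_1["i"] > 25) & (static_df_1["i"] < 36)
    & (static_df_1["j"] > 25) & (static_df_1["j"] < 36)
)
old = static_df_1.loc[row_indexer, "sales"]

# Chained form from the question: .loc[mask] returns a copy, and the
# assignment lands on that copy. pandas warns about this, so the warning
# is silenced here only to keep the demonstration quiet.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    static_df_2.loc[row_indexer]["sales"] = old
after_chained = static_df_2["sales"].tolist()

# Single .loc call with row and column indexers: modifies static_df_2 itself.
static_df_2.loc[row_indexer, "sales"] = old
after_fix = static_df_2["sales"].tolist()

print(after_chained)
print(after_fix)
```

Only the second row satisfies all four conditions, so only that row's sales value is copied over. Because the right-hand side is a Series, pandas aligns it on the index, which is why the shared index matters.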
From Boolean indexing in the Pandas User Guide (emphasis mine):
With the choice methods Selection by Label, Selection by Position, and Advanced Indexing you may select along more than one axis using boolean vectors combined with other indexing expressions.
Also from Different choices for indexing:
.loc is primarily label based, but may also be used with a boolean array
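As a small illustration of the two quoted passages, the sketch below selects along both axes at once with .loc: a boolean vector picks the rows, and either a boolean array or a list of labels picks the columns. The frame and its values are made up for the example.

```python
import pandas as pd

# Hypothetical toy frame with the column names from the question.
df = pd.DataFrame({"i": [20, 30, 40], "j": [30, 30, 30], "sales": [1.0, 2.0, 3.0]})

# Boolean vector over the rows.
row_mask = (df["i"] > 25) & (df["i"] < 36)

# Boolean array over the columns (True for "i" and "sales").
col_mask = df.columns.isin(["i", "sales"])

# Boolean indexers on both axes.
subset = df.loc[row_mask, col_mask]

# Boolean row indexer combined with column labels.
by_label = df.loc[row_mask, ["i", "sales"]]

print(subset)
```

Both selections return the same one-row, two-column frame.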