Python: creating new comparison columns based on column references in a separate DataFrame

I have a dataset with columns a_x, b_x, c_x, d_x, a_z, b_z, c_z, d_z:

df=pd.DataFrame({'a_x':['a','b','c'],'b_x':['a','b','c'] ,'c_x':['a','b','c'],'d_x':['a','b','c'],'a_z':['a','b','i'],'b_z':['a','t','c'] ,'c_z':['c','c','c'],'d_z':['a','b','c']})

I have another dataset with columns original, _x, and _z:

header_comp=pd.DataFrame({'original':['a','b','c','d'],'_x':['a_x','b_x','c_x','d_x'],'_z':['a_z','b_z','c_z','d_z']})

I'm trying to create a loop using the header_comp to compare the _x columns to the corresponding _z columns such that new columns are created in the original df dataset: a_comp, b_comp, c_comp, d_comp.

Each of these columns should hold 1 where i_x is equal to i_z and 0 otherwise.

The output should therefore look like this:

df=pd.DataFrame({'a_x':['a','b','c'],'b_x':['a','b','c'] ,'c_x':['a','b','c'],'d_x':['a','b','c'],'a_z':['a','b','i'],'b_z':['a','t','c'] ,'c_z':['c','c','c'],'d_z':['a','b','c'],'a_comp':[1,1,0],'b_comp':[1,0,1] ,'c_comp':[0,0,1],'d_comp':[1,1,1]})

So far, my code looks like this

for i in range(0, len(header_comp)):
    df[header_comp.iloc[i,0] + '_comp'] = (df[header_comp.iloc[i,1]] == df[header_comp.iloc[i,2]]).astype(int)

However, this is not working: it fails with an error of 'Pivotrelease_x'. Is anyone able to troubleshoot this for me?

If I just use the code for individual columns outside of the for loop, there are no problems. e.g.

df[header_comp.iloc[1,0] + '_comp'] = (df[header_comp.iloc[1,1]] == df[header_comp.iloc[1,2]]).astype(int)

Thanks.

CodePudding user response:

You can just use the values in header_comp to index the values in df:

df[header_comp['original'] + '_comp'] = (df[header_comp['_x']].to_numpy() == df[header_comp['_z']]).astype(int)

Output:

>>> df
  a_x b_x c_x d_x a_z b_z c_z d_z  a_comp  b_comp  c_comp  d_comp
0   a   a   a   a   a   a   c   a       1       1       0       1
1   b   b   b   b   b   t   c   b       1       0       0       1
2   c   c   c   c   i   c   c   c       0       1       1       1
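
If you would rather keep an explicit loop, closer to the attempt in the question, here is a minimal sketch using the same sample data. The `referenced` and `missing` names and the up-front check are my own additions, not part of the original code. The check rests on an assumption: the 'Pivotrelease_x' error looks like a KeyError, which is what pandas raises when a name listed in header_comp is not actually a column of df. The sample data here contains every referenced column, so the check passes.

```python
import pandas as pd

df = pd.DataFrame({
    'a_x': ['a', 'b', 'c'], 'b_x': ['a', 'b', 'c'],
    'c_x': ['a', 'b', 'c'], 'd_x': ['a', 'b', 'c'],
    'a_z': ['a', 'b', 'i'], 'b_z': ['a', 't', 'c'],
    'c_z': ['c', 'c', 'c'], 'd_z': ['a', 'b', 'c'],
})
header_comp = pd.DataFrame({
    'original': ['a', 'b', 'c', 'd'],
    '_x': ['a_x', 'b_x', 'c_x', 'd_x'],
    '_z': ['a_z', 'b_z', 'c_z', 'd_z'],
})

# Hypothetical sanity check (not in the original code): collect any name
# referenced in header_comp that is not a column of df.
referenced = pd.concat([header_comp['_x'], header_comp['_z']])
missing = [name for name in referenced if name not in df.columns]
if missing:
    raise KeyError(f"columns not found in df: {missing}")

# One comparison column per row of header_comp. name=None yields plain
# tuples, which avoids namedtuple renaming fields that start with '_'.
for original, x_col, z_col in header_comp.itertuples(index=False, name=None):
    df[original + '_comp'] = (df[x_col] == df[z_col]).astype(int)
```

On your real data, printing `missing` before the loop should show whether a name such as 'Pivotrelease_x' is listed in the mapping table but absent from df (a stray space or a difference in capitalisation would be enough).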