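I have 3 dataframes with 100 rows each: one with a single column (a1) and two with 20 columns each (b1 to b20 and c1 to c20). Shown side by side, the first rows look like this: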
a1  b1  b2  ....  b20  c1  c2  ....  c20
 0   1   0  ....    0   1   0  ....    0
 1   0   1  ....    1   0   0  ....    0
 0   0   0  ....    0   1   1  ....    1
 1   0   0  ....    0   0   0  ....    1
 0   1   1  ....    1   1   0  ....    0
 1   0   0  ....    1   1   0  ....    0
I want to compute ((a1==0) & (b1==1) & (c1==1)).astype(int),
then ((a1==0) & (b2==1) & (c2==1)).astype(int),
and so on up to ((a1==0) & (b20==1) & (c20==1)).astype(int),
and store the results in a new dataframe.
The final output dataframe should have 20 columns, one for each (b, c) pair.
CodePudding user response:
Because the column names differ between the DataFrames, pandas cannot align them by label, so it is necessary to convert df3 and df1 to NumPy arrays and combine them by position:
# df1 has a single column, so its (n_rows, 1) array is broadcast across every column of df2
df = (df2.eq(1) & df3.eq(1).to_numpy() & df1.eq(0).to_numpy()).astype(int)
print(df)
   b1  b2  b20
0   1   0    0
1   0   0    0
2   0   0    0
3   0   0    0
4   1   0    0
5   0   0    0
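To make the broadcasting step explicit, here is a minimal self-contained sketch. It assumes the three dataframes contain only the six sample rows and the b1, b2, b20 / c1, c2, c20 columns visible in the question; the names mask_a, mask_b, mask_c and out are introduced purely for illustration.

```python
import pandas as pd

# Assumed sample data: the six rows and the visible columns from the question.
df1 = pd.DataFrame({"a1": [0, 1, 0, 1, 0, 1]})
df2 = pd.DataFrame({"b1": [1, 0, 0, 0, 1, 0],
                    "b2": [0, 1, 0, 0, 1, 0],
                    "b20": [0, 1, 0, 0, 1, 1]})
df3 = pd.DataFrame({"c1": [1, 0, 1, 0, 1, 1],
                    "c2": [0, 0, 1, 0, 0, 0],
                    "c20": [0, 0, 1, 1, 0, 0]})

# Boolean masks as plain NumPy arrays, so column labels play no role.
mask_a = df1.eq(0).to_numpy()  # shape (6, 1)
mask_b = df2.eq(1).to_numpy()  # shape (6, 3)
mask_c = df3.eq(1).to_numpy()  # shape (6, 3)

# The (6, 1) mask is broadcast across the columns of the (6, 3) masks.
out = pd.DataFrame((mask_a & mask_b & mask_c).astype(int),
                   columns=df2.columns, index=df2.index)
print(out)
```

Doing everything in NumPy and rebuilding the DataFrame at the end is one possible design choice; it keeps the column labels of df2 while matching the b and c columns purely by position.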
CodePudding user response:
The code below gives, for each pair of columns (b1 and c1, b2 and c2, and so on), the row indices that satisfy the conditions in all three dataframes. Here a, b and c are the three dataframes:
[{f"{i}": list(filter(lambda item:(a["a1"][item] == 0) & (b[f"b{i}"][item] == 1) & (c[f"c{i}"][item] == 1), range(100)))} for i in range(1, 21)]
Subset of the output (the exact indices depend on the data) -
[{'1': [1, 9, 14, 18, 23, 24, 28, 33, 45, 51, 53, 77, 88, 89, 90]},
{'2': [5, 17, 27, 32, 44, 56, 73, 76, 79, 88, 89, 92]},
{'3': [9, 14, 24, 25, 43, 46, 53, 55, 73, 76, 86, 91, 92, 94, 96]},
{'4': [17, 20, 22, 24, 26, 34, 46, 65, 73, 76, 77, 81, 88, 92]},
{'5': [9, 17, 23, 34, 36, 44, 48, 49, 75, 76, 88, 91, 94, 95]}]
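The same index lists can probably be obtained without a per-row lambda. The sketch below is one way to do it, assuming randomly generated 0/1 data (the seed, the sizes and the names rng, mask and indices are hypothetical) and using the a1 == 0 condition from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical random 0/1 data with the shapes described in the question.
rng = np.random.default_rng(0)
n_rows, n_cols = 100, 20
a = pd.DataFrame({"a1": rng.integers(0, 2, n_rows)})
b = pd.DataFrame(rng.integers(0, 2, (n_rows, n_cols)),
                 columns=[f"b{i}" for i in range(1, n_cols + 1)])
c = pd.DataFrame(rng.integers(0, 2, (n_rows, n_cols)),
                 columns=[f"c{i}" for i in range(1, n_cols + 1)])

# Combined boolean mask of shape (100, 20); the (100, 1) a1 mask is broadcast.
mask = a[["a1"]].eq(0).to_numpy() & b.eq(1).to_numpy() & c.eq(1).to_numpy()

# For each column pair, collect the row positions where the mask is True.
indices = {str(i + 1): np.flatnonzero(mask[:, i]).tolist() for i in range(n_cols)}
print(indices["1"])
```

This returns a single dictionary keyed by the pair number rather than a list of one-item dictionaries, which tends to be easier to look up.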
I had fun solving this question. Thanks for asking it!