I have a CSV file that I read using pandas. I would like to compare some of the columns and then use the outcome of the comparison to make a decision. An example of the data is shown below.
A | B | C | D |
---|---|---|---|
6 | [5, 3, 4, 1] | -4.2974843 | [-5.2324843, -5.2974843, -6.2074043, -6.6974803] |
2 | [3, 6, 4, 7] | -6.4528433 | [-6.2324843, -7.0974845, -7.2034041, -7.6974804] |
3 | [6, 2, 4, 5] | -3.5322451 | [-4.3124440, -4.9073840, -5.2147042, -6.1904800] |
1 | [4, 3, 6, 2] | -5.9752843 | [-5.2324843, -5.2974843, -6.2074043, -6.6974803] |
7 | [2, 3, 4, 1] | -1.2974652 | [-3.1232843, -4.2474643, -5.2074043, -6.1994802] |
5 | [1, 3, 7, 2] | -9.884843 | [-8.0032843, -8.0974843, -9.2074043, -9.6904603] |
4 | [7, 3, 1, 4] | -2.3984843 | [-7.2324843, -8.2094845, -9.2044013, -9.7914001] |
Here is the code I am using:
n_A = data['A']
n_B = data['B']
n_C = data['C']
n_D = data['D']
result_compare = []
for w, e in enumerate(n_A):
    for ro, ver in enumerate(n_B):
        for row, m in enumerate(n_C):
            for r, t in enumerate(n_D):
                if ro == w:
                    if r == row:
                        if row == ro:
                            if r == 0:
                                if t[r] > m:
                                    b = ver[r]
                                    result_compare.append(b)
                                else:
                                    b = e
                                    result_compare.append(b)
                            elif r >= 0:
                                q = r - r
                                if t[q] > m:
                                    b = ver[q]
                                    result_compare.append(b)
                                else:
                                    b = e
                                    result_compare.append(b)
I only needed the columns required for the comparison, which is why I selected them like this:
n_A = data['A']
n_B = data['B']
n_C = data['C']
n_D = data['D']
The expected result for the sample data is:
result_compare = [6, 3, 3, 4, 7, 1, 4]
The values in each list in D are arranged in descending order, which is why only the first element of the list is used here. When the first element of the list in D is greater than the value in C, we choose the first element of the list in B; otherwise we choose the value in A. I am looking for a more efficient approach, since my code takes a long time to produce results, especially on large data.
CodePudding user response:
In your case, I would apply the comparison row by row:
data['newRow'] = data.apply(lambda row: row['B'][0] if row['D'][0] > row['C'] else row['A'], axis=1)
And if you need it as a list at the end:
list(data['newRow'])
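One caveat: if the data really comes straight from a CSV file, B and D are probably read as strings rather than lists, so row['B'][0] would return the character "[" instead of a number. Here is a minimal sketch of how that could be handled, assuming the lists are stored as text like "[5, 3, 4, 1]". It parses them with ast.literal_eval while reading, then compares whole columns with np.where instead of apply. The csv_text sample (the first three rows of your table) and the names first_b and first_d are only for illustration.

```python
import ast
import io

import numpy as np
import pandas as pd

# Illustrative sample: the first three rows of the question's table as CSV text.
csv_text = '''A,B,C,D
6,"[5, 3, 4, 1]",-4.2974843,"[-5.2324843, -5.2974843, -6.2074043, -6.6974803]"
2,"[3, 6, 4, 7]",-6.4528433,"[-6.2324843, -7.0974845, -7.2034041, -7.6974804]"
3,"[6, 2, 4, 5]",-3.5322451,"[-4.3124440, -4.9073840, -5.2147042, -6.1904800]"
'''

# Assumption: B and D are stored as text, so convert them to real lists on read.
data = pd.read_csv(
    io.StringIO(csv_text),
    converters={"B": ast.literal_eval, "D": ast.literal_eval},
)

# Extract the first element of each list once.
first_b = np.array([b[0] for b in data["B"]])
first_d = np.array([d[0] for d in data["D"]])

# Column-wise rule: take B[0] where D[0] > C, otherwise take A.
data["newRow"] = np.where(first_d > data["C"].to_numpy(), first_b, data["A"].to_numpy())

result_compare = data["newRow"].tolist()
print(result_compare)  # [6, 3, 3]
```

Both this and the apply version visit each row once, so either should be far faster than the four nested loops. I have not timed them on your data.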