I want an efficient way of comparing columns of data to make a decision

I have a CSV file that I read using pandas. I would like to make a comparison between some of the columns and then use the outcome of the comparison to make a decision. An example of the data is shown below.

A	B	C	D
6	[5, 3, 4, 1]	-4.2974843	[-5.2324843, -5.2974843, -6.2074043, -6.6974803]
2	[3, 6, 4, 7]	-6.4528433	[-6.2324843, -7.0974845, -7.2034041, -7.6974804]
3	[6, 2, 4, 5]	-3.5322451	[-4.3124440, -4.9073840, -5.2147042, -6.1904800]
1	[4, 3, 6, 2]	-5.9752843	[-5.2324843, -5.2974843, -6.2074043, -6.6974803]
7	[2, 3, 4, 1]	-1.2974652	[-3.1232843, -4.2474643, -5.2074043, -6.1994802]
5	[1, 3, 7, 2]	-9.884843	[-8.0032843, -8.0974843, -9.2074043, -9.6904603]
4	[7, 3, 1, 4]	-2.3984843	[-7.2324843, -8.2094845, -9.2044013, -9.7914001]

Here is the code I am using:

    n_A = data['A']
    n_B = data['B']
    n_C = data['C']
    n_D = data['D']

    result_compare = []
    for w, e in enumerate(n_A):
        for ro, ver in enumerate(n_B):      
            for row, m in enumerate(n_C):
                for r, t in enumerate(n_D):
                    if ro==w:
                        if r ==row:
                            if row==ro:
                                if r==0:
                                    if t[r]>m:
                                        b = ver[r]
                                        result_compare.append(b)
                                    else:
                                        b = e
                                        result_compare.append(b)

                                elif r>=0:
                                    q = r-r
                                    if t[q]>m:
                                        b = ver[q]
                                        result_compare.append(b)
                                    else:
                                        b = e
                                        result_compare.append(b)

I only need some of the columns for the comparison, which is why the code starts by selecting them:

    n_A = data['A']
    n_B = data['B']
    n_C = data['C']
    n_D = data['D']

For the sample data above, the expected result is:

    result_compare = [6, 3, 3, 4, 7, 1, 4]

The values in each list in D are sorted in descending order, which is why only the first element of the list is used. When the first element of the list in D is greater than the value in C, I choose the first element of the list in B; otherwise I choose the value in A. I would like a more efficient approach, because my code takes a long time to produce results, especially on large data.

CodePudding user response:

In your case I would replace the nested loops with a single row-wise `apply`:

    data['newRow'] = data.apply(lambda row: row["B"][0] if row["D"][0] > row["C"] else row["A"], axis=1)
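One caveat, assuming the file is loaded with a plain `pd.read_csv`: the list-like columns B and D arrive as strings such as `"[5, 3, 4, 1]"`, so `row["B"][0]` would return the character `"["` rather than `5`. The sketch below parses them into real lists with `ast.literal_eval` before applying the rule. The inline CSV text (the first three sample rows) and the tab separator are assumptions made for illustration, not details confirmed by the question.

```python
import ast
import io

import pandas as pd

# Hypothetical inline copy of the first three sample rows, tab-separated.
csv_text = (
    "A\tB\tC\tD\n"
    "6\t[5, 3, 4, 1]\t-4.2974843\t[-5.2324843, -5.2974843, -6.2074043, -6.6974803]\n"
    "2\t[3, 6, 4, 7]\t-6.4528433\t[-6.2324843, -7.0974845, -7.2034041, -7.6974804]\n"
    "3\t[6, 2, 4, 5]\t-3.5322451\t[-4.3124440, -4.9073840, -5.2147042, -6.1904800]\n"
)

# The converters turn each string like "[5, 3, 4, 1]" into a Python list.
data = pd.read_csv(
    io.StringIO(csv_text),
    sep="\t",
    converters={"B": ast.literal_eval, "D": ast.literal_eval},
)

# Same rule as above: first element of B if D[0] > C, otherwise A.
data["newRow"] = data.apply(
    lambda row: row["B"][0] if row["D"][0] > row["C"] else row["A"], axis=1
)
result = data["newRow"].tolist()
print(result)
```

If the lists are already real Python lists in your DataFrame, the converters are unnecessary.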

And if you need it as a list by the end:

    list(data['newRow'])
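If `apply` is still too slow on large data, a possible alternative is to pull out the first elements once and let `numpy.where` do the comparison for all rows at the same time. This is only a sketch: it assumes B and D already hold Python lists (not strings), and the names `first_b` and `first_d` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, with B and D as Python lists.
data = pd.DataFrame({
    "A": [6, 2, 3, 1, 7, 5, 4],
    "B": [[5, 3, 4, 1], [3, 6, 4, 7], [6, 2, 4, 5], [4, 3, 6, 2],
          [2, 3, 4, 1], [1, 3, 7, 2], [7, 3, 1, 4]],
    "C": [-4.2974843, -6.4528433, -3.5322451, -5.9752843,
          -1.2974652, -9.884843, -2.3984843],
    "D": [
        [-5.2324843, -5.2974843, -6.2074043, -6.6974803],
        [-6.2324843, -7.0974845, -7.2034041, -7.6974804],
        [-4.3124440, -4.9073840, -5.2147042, -6.1904800],
        [-5.2324843, -5.2974843, -6.2074043, -6.6974803],
        [-3.1232843, -4.2474643, -5.2074043, -6.1994802],
        [-8.0032843, -8.0974843, -9.2074043, -9.6904603],
        [-7.2324843, -8.2094845, -9.2044013, -9.7914001],
    ],
})

# Extract the first element of every list once, as plain NumPy arrays.
first_b = np.array([b[0] for b in data["B"]])
first_d = np.array([d[0] for d in data["D"]])

# Element-wise rule: B[0] where D[0] > C, otherwise A.
mask = first_d > data["C"].to_numpy()
result_compare = np.where(mask, first_b, data["A"].to_numpy()).tolist()
print(result_compare)  # [6, 3, 3, 4, 7, 1, 4]
```

The list comprehensions still loop in Python, but only once per column, so the comparison itself no longer runs row by row.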