I want an efficient way of comparing columns of data to make a decision

Time:05-03

I have a CSV file that I read using pandas. I would like to compare some of the columns and then use the outcome of the comparison to make a decision. An example of the data is shown below.

A  B             C           D
6  [5, 3, 4, 1]  -4.2974843  [-5.2324843, -5.2974843, -6.2074043, -6.6974803]
2  [3, 6, 4, 7]  -6.4528433  [-6.2324843, -7.0974845, -7.2034041, -7.6974804]
3  [6, 2, 4, 5]  -3.5322451  [-4.3124440, -4.9073840, -5.2147042, -6.1904800]
1  [4, 3, 6, 2]  -5.9752843  [-5.2324843, -5.2974843, -6.2074043, -6.6974803]
7  [2, 3, 4, 1]  -1.2974652  [-3.1232843, -4.2474643, -5.2074043, -6.1994802]
5  [1, 3, 7, 2]  -9.884843   [-8.0032843, -8.0974843, -9.2074043, -9.6904603]
4  [7, 3, 1, 4]  -2.3984843  [-7.2324843, -8.2094845, -9.2044013, -9.7914001]

Here is the code I am using:

    n_A = data['A']
    n_B = data['B']
    n_C = data['C']
    n_D = data['D']

    result_compare = []
    for w, e in enumerate(n_A):
        for ro, ver in enumerate(n_B):      
            for row, m in enumerate(n_C):
                for r, t in enumerate(n_D):
                    if ro==w:
                        if r ==row:
                            if row==ro:
                                if r==0:
                                    if t[r]>m:
                                        b = ver[r]
                                        result_compare.append(b)
                                    else:
                                        b = e
                                        result_compare.append(b)

                                elif r>=0:
                                    q = r-r
                                    if t[q]>m:
                                        b = ver[q]
                                        result_compare.append(b)
                                    else:
                                        b = e
                                        result_compare.append(b)

I selected only the columns needed for the comparison, which is why the code begins with:

    n_A = data['A']
    n_B = data['B']
    n_C = data['C']
    n_D = data['D']

For the sample data above, the expected result is:

    result_compare = [6, 3, 3, 4, 7, 1, 4]

The values in each list in D are arranged in descending order, which is why only the first element of the list is used here. When the first element of the list in D is greater than the value in C for that row, we choose the first element of the list in B; otherwise we choose the value in A. I would like a more efficient approach, because my code takes a long time to produce results, especially on large data.

Answer:

In your case, I would apply the rule row by row with apply and axis=1:

    data['newRow'] = data.apply(lambda row: row['B'][0] if row['D'][0] > row['C'] else row['A'], axis=1)

And if you need it as a list at the end:

    list(data['newRow'])
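
If apply with axis=1 is still too slow on large data, a vectorized variant may help. The following is only a rough sketch. It assumes that B and D came back from read_csv as text such as "[5, 3, 4, 1]", hence the ast.literal_eval step, which you can skip if the columns already hold lists. The names first_b and first_d are my own, not from the question.

```python
import ast

import numpy as np
import pandas as pd

# Sample rows from the question. B and D are stored as strings here, on the
# assumption that read_csv returned the list columns as text.
data = pd.DataFrame({
    "A": [6, 2, 3, 1, 7, 5, 4],
    "B": ["[5, 3, 4, 1]", "[3, 6, 4, 7]", "[6, 2, 4, 5]", "[4, 3, 6, 2]",
          "[2, 3, 4, 1]", "[1, 3, 7, 2]", "[7, 3, 1, 4]"],
    "C": [-4.2974843, -6.4528433, -3.5322451, -5.9752843,
          -1.2974652, -9.884843, -2.3984843],
    "D": ["[-5.2324843, -5.2974843, -6.2074043, -6.6974803]",
          "[-6.2324843, -7.0974845, -7.2034041, -7.6974804]",
          "[-4.3124440, -4.9073840, -5.2147042, -6.1904800]",
          "[-5.2324843, -5.2974843, -6.2074043, -6.6974803]",
          "[-3.1232843, -4.2474643, -5.2074043, -6.1994802]",
          "[-8.0032843, -8.0974843, -9.2074043, -9.6904603]",
          "[-7.2324843, -8.2094845, -9.2044013, -9.7914001]"],
})

# Convert the text back into Python lists (skip if the columns already hold lists).
for col in ("B", "D"):
    data[col] = data[col].apply(ast.literal_eval)

# Pull out the first element of every list once, as plain NumPy arrays.
first_b = np.array([b[0] for b in data["B"]])
first_d = np.array([d[0] for d in data["D"]], dtype=float)

# Element-wise rule: B[0] where D[0] > C, otherwise A.
data["newRow"] = np.where(first_d > data["C"].to_numpy(), first_b, data["A"].to_numpy())

result_compare = data["newRow"].tolist()
print(result_compare)  # [6, 3, 3, 4, 7, 1, 4]
```

The comparison itself runs in NumPy rather than calling a Python function once per row. Extracting the first elements is still a Python-level pass over each list column, but it is a single pass, not four nested loops.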