Sum two arrays if they have the same index, but keep the indexes afterward?

Let's say we have two arrays.

first:

arr1=
    [1000      5.0
     1270      5.0
     1315      5.0
    ]

arr2=
        [578      5.0
         1000      5.0
         1315      5.0
        ]

As you can see, each array holds indexes and their values. I want to combine the two arrays so that values sharing the same index are added together, and any index that appears in only one array is kept with its own value.

Final output:

FINAL_ARR=
            [578      5.0
             1000      10.0
             1270      5.0
             1315      10.0
            ]

I've tried using zip(), but I need to keep the indexes.

Note: they don't need to be sorted.

CodePudding user response：

The use of a dict seems to be the way to go here:

map1={
  1000: 5.0,
  1270: 5.0,
  1315: 5.0
}

map2={
578: 5.0,
1000: 5.0,
1315: 5.0
}

r = map1  # r is the same object as map1, so map1 is modified in place
for k, v in map2.items():
    if k not in r:
        r[k] = 0
    r[k] += v
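
Note that `r = map1` does not make a copy, so the loop updates `map1` itself. If both inputs need to stay intact, a minimal sketch could copy the first dictionary before looping. The function name `merge_sum` is only illustrative and is not part of the answer above:

```python
def merge_sum(first, second):
    """Return a new dict in which the values of shared keys are added together."""
    result = dict(first)  # shallow copy, so `first` is left untouched
    for key, value in second.items():
        result[key] = result.get(key, 0) + value
    return result


map1 = {1000: 5.0, 1270: 5.0, 1315: 5.0}
map2 = {578: 5.0, 1000: 5.0, 1315: 5.0}
merged = merge_sum(map1, map2)
print(merged)  # {1000: 10.0, 1270: 5.0, 1315: 10.0, 578: 5.0}
```

The price is one extra copy of the first dictionary, which matters only for very large inputs.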

CodePudding user response：

Those are not arrays. They aren't even Python.

In Python, arrays are sequential, with indexes 0, 1, 2, 3, ...

The closest thing to what you describe is a dictionary, whose keys can be almost anything, each associated with a value. So you could create a new dict whose keys are the union of the keys of both sources, and whose values are the sum of the two values when a key is in both, or the single value when it is in only one.

dic1={1000: 5.0, 1270: 5.0, 1315: 5.0}
dic2={578: 5.0, 1000: 5.0, 1315: 5.0}
res = {k: (dic1[k] if k in dic1 else 0) + (dic2[k] if k in dic2 else 0) for k in set(dic1).union(dic2)}

Some timing indications (disappointing from my point of view: most of the time, comprehensions are not only one-liners but also faster than an explicit loop. Not here, though).

My solution: 2.15
Variant of my solution, with the .get suggested by Mozway to Rahul (it applies to my solution too): 2.29. Personally, I would pay those 0.14 to avoid the harder-to-read conditional expression.
Rahul's solution: 2.78
imburningbabe's solution: 4439.33. Not surprising: pandas is heavy machinery, way too heavy for such a task. For much bigger dictionaries the gap shrinks a lot, because pandas avoids the Python for loop that none of the rest of us avoid. For inputs of 1 million entries, the ratio between the best solution and the pandas one is only 2 (versus over 2000 for the example input). So it is still the slowest there, but pandas could eventually win with even more data.

But all of that is beaten by Julien's direct solution, even though I include a copy of the first dictionary in the timing of his solution. That is to compare like with like: all our other solutions return a third dictionary and leave the two inputs intact, while his alters one of the input dictionaries. Making it work on a copy has us all playing by the same rules.

Julien's solution: 1.81

So, for now, if you want the fastest solution, Julien's is the one. If you want a one-liner, mine (without the elegant .get suggested by Mozway) is the fastest among the one-liners.
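
The timings above could be reproduced with something like the following sketch based on the standard `timeit` module. The function names and the repetition count are assumptions made for illustration, and the absolute numbers will differ from machine to machine:

```python
import timeit

dic1 = {1000: 5.0, 1270: 5.0, 1315: 5.0}
dic2 = {578: 5.0, 1000: 5.0, 1315: 5.0}


def comprehension():
    # One-liner with explicit membership tests
    return {k: (dic1[k] if k in dic1 else 0) + (dic2[k] if k in dic2 else 0)
            for k in set(dic1).union(dic2)}


def comprehension_get():
    # Same one-liner, using dict.get with a default of 0
    return {k: dic1.get(k, 0) + dic2.get(k, 0) for k in set(dic1).union(dic2)}


def loop_on_copy():
    # Explicit loop, run on a copy so that both inputs stay intact
    r = dict(dic1)
    for k, v in dic2.items():
        if k not in r:
            r[k] = 0
        r[k] += v
    return r


for func in (comprehension, comprehension_get, loop_on_copy):
    elapsed = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {elapsed:.3f} s")
```

Only the relative ordering is meaningful here, and with dictionaries this small it can change between runs.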

In a comment on another solution, you said that you also wanted only the intersection (which is really another question). For that, you could use:

{k: dic1[k] + dic2[k] for k in set(dic1).intersection(dic2)}
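
Dictionary key views are set-like, so the same intersection can also be written with the `&` operator, without building a set first. A small sketch on the example data (the name `common` is just illustrative):

```python
dic1 = {1000: 5.0, 1270: 5.0, 1315: 5.0}
dic2 = {578: 5.0, 1000: 5.0, 1315: 5.0}

# dict.keys() returns a set-like view, so & yields the keys present in both
common = {k: dic1[k] + dic2[k] for k in dic1.keys() & dic2.keys()}
print(common == {1000: 10.0, 1315: 10.0})  # True
```

The key order of `common` is not guaranteed, which is why the check compares dictionaries instead of printing one.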

CodePudding user response：

If you treat both as dictionaries:

dic1={1000: 5.0, 1270: 5.0, 1315: 5.0}
dic2={578: 5.0, 1000: 5.0, 1315: 5.0}
output = {k: dic1.get(k, 0) + dic2.get(k, 0) for k in list(dic1.keys()) + list(dic2.keys())}
# {1000: 10.0, 1270: 5.0, 1315: 10.0, 578: 5.0}

If both are arrays (lists of [index, value] pairs):

arr1 = [[1000,5.0], [1270, 5.0], [1315, 5.0]]
arr2 = [[578,5.0], [1000, 5.0], [1315, 5.0]]

d = {}
for i in arr1 + arr2:
    d[i[0]] = d.get(i[0], 0) + i[1]
# {1000: 10.0, 1270: 5.0, 1315: 10.0, 578: 5.0}
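
If the result has to go back to the same list-of-pairs shape as the inputs, the dictionary can be unpacked again. A minimal sketch, where the name `final_arr` simply mirrors FINAL_ARR from the question and the sorting is optional, since the question says order does not matter:

```python
arr1 = [[1000, 5.0], [1270, 5.0], [1315, 5.0]]
arr2 = [[578, 5.0], [1000, 5.0], [1315, 5.0]]

totals = {}
for key, value in arr1 + arr2:
    totals[key] = totals.get(key, 0) + value

# Turn the dict back into [index, value] pairs; sorted() is only for readability
final_arr = [[key, value] for key, value in sorted(totals.items())]
print(final_arr)
# [[578, 5.0], [1000, 10.0], [1270, 5.0], [1315, 10.0]]
```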

CodePudding user response：

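An interesting one-line solution with pandas (written with pd.concat and groupby, since DataFrame.append and sum(level=...) were removed in pandas 2.0):

import pandas as pd
res = pd.concat([pd.DataFrame.from_records(arr1).set_index(0), pd.DataFrame.from_records(arr2).set_index(0)]).groupby(level=0).sum().reset_index().values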