Average list based on 'n' lists, all of them with the same length, with Python

Time:01-26

I'm converting sentences to embeddings/indexes generated through the OpenAI 'embeddings' endpoint.

E.g., I'm sending 'n' sentences ["sentenceA","sentenceB","sentenceC","sentenceD","sentenceE"]

and I'm getting as a response something like:

[
    [0.001542, 0.889456, 0.155421, 0.884747], // array for sentenceA
    [0.999956, 0.987778, 0.122222, 0.848484], // array for sentenceB
    [0.123456, 0.588847, 0.945125, 0.911111], // array for sentenceC
    (etc)
]

Each array has exactly the same length (in my use case, 1536 values per array).

I need to reduce that list of 'n' arrays to a single array by averaging element-wise: the first element of the result is the average of every array's first element, the second is the average of every array's second element, and so on. The result should be just one array with 1536 elements.

What would be the easiest/most efficient way to do this with Python / NumPy?

Thank you in advance, and have a great day! :D

CodePudding user response:

You can use NumPy's "mean" function, which computes the mean along the axis you provide. With axis=0 it averages across the lists, position by position. In code it would look something like this:

# Import numpy and declare the lists
import numpy as np

arrays = [[0.001542, 0.889456, 0.155421, 0.884747],
          [0.999956, 0.987778, 0.122222, 0.848484],
          [0.123456, 0.588847, 0.945125, 0.911111]]

# Then use the mean function; axis=0 averages down each column
mean_array = np.mean(arrays, axis=0)

If we display "mean_array" in an interactive session we get this:

array([0.37498467, 0.822027  , 0.40758933, 0.88144733])

This works for any number of lists: if you add more lists, you will still get one single array with the same length as each input list.
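For the real use case (n embeddings of 1536 values each), the same idea could be sketched as below. This is only an illustration: the random data stands in for the actual API response, and the names "embeddings", "matrix", "average" and "average_list" are made up for the example. It also checks the shape before averaging and converts the result back to a plain Python list, in case a list is needed rather than a NumPy array.

```python
import numpy as np

# Hypothetical stand-in for the API response: 5 embeddings, 1536 values each.
rng = np.random.default_rng(0)
embeddings = rng.random((5, 1536)).tolist()

# Turn the list of lists into a 2-D array with one row per sentence.
matrix = np.asarray(embeddings, dtype=float)
assert matrix.ndim == 2, "expected n lists of equal length"

# Average over axis 0 (across sentences), leaving one value per position.
average = matrix.mean(axis=0)

# Optional: convert back to a plain Python list of floats.
average_list = average.tolist()

print(matrix.shape)       # rows = sentences, columns = embedding positions
print(len(average_list))  # same length as each input list
```

Here matrix.mean(axis=0) is equivalent to np.mean(matrix, axis=0); which one to use is a matter of taste.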
