Call a function on whole array instead of on every element of it

Time:07-19

Let's say I have a list of tuples of the form:

 data = [(1,2), (1,2), ..., (1,2)]

and a method data_to_bytes that accepts a tuple, does something to it, and returns bytes. Now I want to call this method on every element of data and save the output to a file. This is what I have so far:

def create_data_file(data_file, data):
    with data_file.open('wb') as _file:
        for i in data:
            _file.write(data_to_bytes(i))
    return data_file

But this is terribly slow and I would like to improve it. Maybe it is possible to get rid of the loop inside the with block. I was thinking about using numpy somehow to call data_to_bytes() on the whole array instead of on every element. Is this possible?

CodePudding user response:

NumPy is only fast if you can replace your "data_to_bytes" with NumPy functions that operate on the whole array at once. NumPy cannot speed up arbitrary Python functions.
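As a rough illustration of what "replacing data_to_bytes with NumPy functions" could look like, here is a minimal sketch. The question does not say what data_to_bytes actually does, so the version below (packing each pair as two little-endian 32-bit integers) is purely an assumption, as is the helper name data_to_bytes_vectorized.

```python
import struct

import numpy as np


def data_to_bytes(pair):
    # Hypothetical stand-in for the real function: packs a pair of
    # integers as two little-endian signed 32-bit values.
    return struct.pack("<ii", *pair)


def data_to_bytes_vectorized(data):
    # Convert the whole list of tuples to one (n, 2) int32 array and
    # serialize it in a single call instead of once per tuple.
    arr = np.asarray(data, dtype="<i4")
    return arr.tobytes()


data = [(1, 2), (3, 4), (5, 6)]
looped = b"".join(data_to_bytes(p) for p in data)
vectorized = data_to_bytes_vectorized(data)
```

If the real data_to_bytes has a fixed binary layout like this, the result of the vectorized call can be written to the file with a single write. If it does something that cannot be expressed as an array operation, this approach does not apply.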

If you don't think that's possible, other ways of increasing performance might be:

  • Caching results of "data_to_bytes" or
  • Improving the performance of "data_to_bytes" itself
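The caching idea could be sketched roughly as follows. This assumes that data_to_bytes is a pure function and that the tuples repeat often (as in the example data); the struct-based body is again only a hypothetical stand-in, and write_all is an illustrative helper name. The sketch also joins all chunks and writes once, which is a separate guess about where the time goes.

```python
import io
import struct
from functools import lru_cache


@lru_cache(maxsize=None)
def data_to_bytes(pair):
    # Hypothetical stand-in for the real conversion. Tuples are
    # hashable, so repeated inputs are served from the cache.
    return struct.pack("<ii", *pair)


def write_all(stream, data):
    # Build the whole payload first, then issue a single write call.
    stream.write(b"".join(map(data_to_bytes, data)))


data = [(1, 2)] * 1000 + [(3, 4)]
buffer = io.BytesIO()  # stands in for the file opened with 'wb'
write_all(buffer, data)
cache_stats = data_to_bytes.cache_info()
```

Caching only pays off when the same tuples occur many times; for mostly unique inputs it adds memory use without saving work.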

CodePudding user response:

Take a look at numpy.apply_along_axis. It applies the given function to each 1-D slice of the input array along the chosen axis and returns a new array with the results.
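A minimal sketch of how that call could be used here, assuming a hypothetical conversion (row_to_uint8) that packs each row as two little-endian 32-bit integers and returns the bytes as a uint8 array. Note that numpy.apply_along_axis still calls the Python function once per row, so it is mainly a convenience and should not be expected to remove the per-element cost.

```python
import struct

import numpy as np


def row_to_uint8(row):
    # Hypothetical per-row conversion: pack two integers, then expose
    # the packed bytes as a fixed-length uint8 array so that NumPy can
    # stack the results.
    packed = struct.pack("<ii", int(row[0]), int(row[1]))
    return np.frombuffer(packed, dtype=np.uint8)


data = [(1, 2), (3, 4), (5, 6)]
arr = np.asarray(data)

# axis=1 means the function receives one row (one tuple) at a time.
byte_rows = np.apply_along_axis(row_to_uint8, 1, arr)
payload = byte_rows.tobytes()
```

Returning a uint8 array rather than a bytes object is a deliberate choice in this sketch: NumPy's fixed-width bytes dtype drops trailing null bytes, which would corrupt binary data.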

CodePudding user response:

Try this code:

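data = [(1, 2), (1, 2), (1, 2)]

def create_data_file(data_file, data):
    # data_file is a path string here, so use the built-in open().
    with open(data_file, 'wb') as _file:
        for i in data:
            # bytearray(i) stores each integer as a single byte, so every
            # value in the tuple must be in range(256).
            _file.write(bytearray(i))
    return data_file

create_data_file('test.txt', data)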