How can I convert several NumPy arrays with ints to a NumPy array with formatted strings?

Time:11-23

I have three arrays of the same length containing integers: years, months and days. I want to create a (NumPy) array of the same length, containing formatted strings like '(-)yyyy-mm-dd' using the format '%i-%2.2i-%2.2i'.

For the scalar case, I would do something like

year=2000; month=1; day=1
datestr = '%i-%2.2i-%2.2i' % (year, month, day)

which would yield '2000-01-01'.

How can I create the vector version of this, e.g.:

import numpy as np
years  = np.array([-1000, 0, 1000, 2000])
months = np.array([1, 2, 3, 5])
days   = np.array([1, 11, 21, 31])
datestr_array = np.somefunction(years, months, days, format='%i-%2.2i-%2.2i', ???)

Note that the date range I am interested in lies between the years -2000 and 3000 (CE), and hence neither Python's datetime nor pandas' DatetimeIndex offers a solution.

CodePudding user response:

Explanation

Let's create a function that converts any date, with no bounds on the year, to a yyyy-mm-dd string. We can use string formatting: define a template string and fill in the relevant values. We also need to pad each field with leading zeros to 'fill it out', e.g. 2001-05-20.
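As a quick illustration of the padding behaviour, here is a small sketch using plain Python ints. The variable names are made up for this example. It compares the template used in this answer with the '%' format from the question; the two differ for years shorter than four digits.

```python
# Template used in this answer: the field width of 4 also counts the minus sign.
template = "{0:04}-{1:02}-{2:02}"
typical = template.format(2001, 5, 20)
year_zero = template.format(0, 2, 11)
negative = template.format(-5, 1, 1)

# Format from the question: '%i' applies no padding to the year.
question_style = "%i-%2.2i-%2.2i" % (0, 2, 11)

print(typical, year_zero, negative, question_style)
```

This prints 2001-05-20 0000-02-11 -005-01-01 0-02-11, so pick whichever template matches the year layout you want.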

To run this function, the corresponding years, months and days must be grouped together. The built-in zip function does this: it pairs up the i-th element of each array as a tuple. We then convert the result to a 2-D NumPy array with one row per date.
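To make the grouping step concrete, the sketch below builds the 2-D array from the question's sample data and checks its shape. np.column_stack is shown only as an assumed alternative that should give the same result for 1-D inputs; it is not part of the original approach.

```python
import numpy as np

years = np.array([-1000, 0, 1000, 2000])
months = np.array([1, 2, 3, 5])
days = np.array([1, 11, 21, 31])

# zip yields one (year, month, day) tuple per date; np.array turns them into rows.
converted = np.array(list(zip(years, months, days)))
print(converted.shape)  # one row per date, three columns

# Alternative: stack the 1-D arrays as columns directly.
stacked = np.column_stack((years, months, days))
print(np.array_equal(converted, stacked))
```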

Now that we have the data in the correct form, let's pass it through our function. We can build the new array with numpy.apply_along_axis(func1d, axis, arr), which calls the function on each 1-D slice along the given axis. Because each (year, month, day) triple lies along the second axis, the axis parameter must be set to 1.

Code

import numpy as np

def FormatDate(data):
    # data is one row of (year, month, day)
    # Note that this formatting can later be updated to account for some weirdness
    return "{0:04}-{1:02}-{2:02}".format(data[0], data[1], data[2])

# Convert the data into rows where y, m, d are aligned
converted = np.array(list(zip(years, months, days)))

# Now, let's apply that function to turn each row into a date string
datestr_array = np.apply_along_axis(FormatDate, 1, converted)
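
One thing to watch for: apply_along_axis sizes its output from the first result, so if the first date string is shorter than a later one, the later strings can be truncated. A minimal alternative sketch, assuming a plain list comprehension is acceptable for your array sizes, is shown below. The helper name format_dates is hypothetical, and it uses the '%' format from the question.

```python
import numpy as np

def format_dates(years, months, days, fmt="%i-%2.2i-%2.2i"):
    """Return a NumPy string array with one formatted date per element."""
    # tolist() converts the NumPy integers to plain Python ints.
    y = np.asarray(years).tolist()
    m = np.asarray(months).tolist()
    d = np.asarray(days).tolist()
    # Building the array from the full list lets NumPy pick a string
    # width that fits the longest entry.
    return np.array([fmt % ymd for ymd in zip(y, m, d)])

years = np.array([-1000, 0, 1000, 2000])
months = np.array([1, 2, 3, 5])
days = np.array([1, 11, 21, 31])

datestr_array = format_dates(years, months, days)
print(datestr_array)
```

Passing fmt="%05i-%2.2i-%2.2i" or another template changes the year layout without touching the rest of the function.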