Remove characters from the values in a NumPy array

I am trying to remove all characters except the last 4 from every value in a NumPy array. I'd normally use [-4:], but if I use that on the array I only get the last 4 values of the array.

andatum = andatum[-4:]
print(andatum)

'15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999']

runfile('O:/GIS/GEP/Risikomanagement/Flussvermessung/ALD/Analyses/ReadFilesToRawData.py', wdir='O:/GIS/GEP/Risikomanagement/Flussvermessung/ALD/Analyses')
['15.11.1999' '15.11.1999' '15.11.1999' '15.11.1999']

What I am trying to do is obtain the same array, but with only the last 4 characters of each value (the year). Any idea how I could do that?

Thank you,

Davide

I would like to remove all characters except the last 4 (the year), but using [-4:] I get the last 4 entries of my NumPy array.

CodePudding user response：

Looks like you have a 1d array of strings:

In [28]: arr = np.array(['15.11.1999']*6)    
In [29]: arr
Out[29]: 
array(['15.11.1999', '15.11.1999', '15.11.1999', '15.11.1999',
       '15.11.1999', '15.11.1999'], dtype='<U10')

NumPy is better suited to numbers than to strings; an array of strings like this one is little better than a list of strings. For convenience, though, NumPy has a set of functions (np.char) that apply string methods to the elements of an array.

In [30]: np.char.split(arr, sep='.')
Out[30]: 
array([list(['15', '11', '1999']), list(['15', '11', '1999']),
       list(['15', '11', '1999']), list(['15', '11', '1999']),
       list(['15', '11', '1999']), list(['15', '11', '1999'])],
      dtype=object)

We can convert this to a 2d array of strings with stack (or vstack):

In [31]: np.stack(_)
Out[31]: 
array([['15', '11', '1999'],
       ['15', '11', '1999'],
       ['15', '11', '1999'],
       ['15', '11', '1999'],
       ['15', '11', '1999'],
       ['15', '11', '1999']], dtype='<U4')

And select a column:

In [32]: np.stack(_)[:,2]
Out[32]: array(['1999', '1999', '1999', '1999', '1999', '1999'], dtype='<U4')
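As a possible shortcut for the split, stack and select steps above, np.char.rpartition splits each string around its right-most separator. This is only a sketch and assumes every value ends in '.YYYY', as in the sample data:

```python
import numpy as np

arr = np.array(['15.11.1999'] * 6)

# rpartition splits each string around the right-most '.',
# giving an array of shape (6, 3): [before, separator, after].
parts = np.char.rpartition(arr, '.')

# The last column holds everything after the final '.', i.e. the year.
years = parts[:, 2]
print(years)
```

Unlike a fixed [-4:] slice, this depends on the separator rather than on the year being exactly four characters long.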

np.char does not have a function to index or slice the strings. For that we have to fall back on a list comprehension:

In [33]: [i[-4:] for i in arr]
Out[33]: ['1999', '1999', '1999', '1999', '1999', '1999']

That kind of iteration is faster over a list than over an array.
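If an array is needed again rather than a list, the comprehension can be wrapped in np.array. A minimal sketch, reusing the sample array from above (the integer conversion at the end is optional and assumes every value ends in a four-digit year):

```python
import numpy as np

arr = np.array(['15.11.1999'] * 6)

# Slice each string in plain Python, then rebuild an array from the list.
years = np.array([s[-4:] for s in arr])
print(years)

# Optional: convert the year strings to integers for numeric work.
years_int = years.astype(int)
print(years_int)
```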

CodePudding user response：

andatum[i] references an item in the array. To reference individual characters of an item, you need a second pair of brackets, like this: andatum[i][x]

To get an array with only the last 4 characters, you need to go over each item of the array like this:

for i in range(len(andatum)):
    andatum[i] = andatum[i][-4:]

Or, to keep things tidier and also faster, this one-liner should do the job (note that it returns a list rather than an array):

andatum = [x[-4:] for x in andatum]
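The two levels of indexing and the direction of the slice are easy to mix up, so here is a small sketch. The sample values are made up for illustration; x[-4:] keeps the last four characters, while x[:-4] drops them:

```python
import numpy as np

# Hypothetical sample values, not the asker's real data.
andatum = np.array(['15.11.1999', '03.02.2004'])

last_item = andatum[-1:]      # slices the ARRAY: its last element
year = andatum[0][-4:]        # slices one STRING: keeps the last 4 characters
prefix = andatum[0][:-4]      # slices one STRING: drops the last 4 characters

print(last_item)
print(year)
print(prefix)
```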