Is there a numpy function to replace str to int values based on list index

Time:11-08

Is there a way to accomplish the code below without the for loop?

I’m assigning an int value to each str based on its index position in a list.

import numpy as np
icing_types=[
    "none",
    "l-mixed",
    "l-rime",
    "l-clear",
    "m-mixed",
    "m-rime",
    "m-clear",
]
idx=np.array(icing_types)
#validtime=basetime index hr
forecast_icing=[
    "none", 
    "none", 
    "l-rime",
    "m-rime"
]
arr=np.array([np.where(ice==idx) for ice in forecast_icing]).flatten()

CodePudding user response:

Borrowing from this answer, you can build a dict that maps each label to its index and vectorize the lookup. It's worth noting that the other answer here is faster.

import numpy as np
icing_types=[
    "none",
    "l-mixed",
    "l-rime",
    "l-clear",
    "m-mixed",
    "m-rime",
    "m-clear",
]


forecast_icing=np.array([
    "none", 
    "none", 
    "l-rime",
    "m-rime"
])

np.vectorize({b:a for a,b in enumerate(icing_types)}.get)(forecast_icing)
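The one-liner above can be unpacked into named steps, which may make it easier to follow. This is only a sketch: the names `type_to_index` and `lookup` are made up for illustration, and the `otypes=[int]` argument is an addition that assumes every label is present in `icing_types`. Keep in mind that `np.vectorize` is a convenience wrapper and still calls the Python function once per element.

```python
import numpy as np

icing_types = [
    "none",
    "l-mixed",
    "l-rime",
    "l-clear",
    "m-mixed",
    "m-rime",
    "m-clear",
]
forecast_icing = np.array(["none", "none", "l-rime", "m-rime"])

# Lookup table: label -> position in icing_types (hypothetical name).
type_to_index = {name: position for position, name in enumerate(icing_types)}

# Wrap dict.get so it can be called on a whole array.
# otypes=[int] fixes the output dtype; it assumes no label is missing,
# because a missing label would make dict.get return None.
lookup = np.vectorize(type_to_index.get, otypes=[int])

result = lookup(forecast_icing)
print(result)  # [0 0 2 5]
```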

CodePudding user response:

In [103]: arr = np.array(forecast_icing)
In [104]: idx,arr
Out[104]: 
(array(['none', 'l-mixed', 'l-rime', 'l-clear', 'm-mixed', 'm-rime',
        'm-clear'], dtype='<U7'),
 array(['none', 'none', 'l-rime', 'm-rime'], dtype='<U6'))

We can test the 2 arrays against each other with:

In [105]: arr[:,None]==idx
Out[105]: 
array([[ True, False, False, False, False, False, False],
       [ True, False, False, False, False, False, False],
       [False, False,  True, False, False, False, False],
       [False, False, False, False, False,  True, False]])

The indices of the True values are:

In [106]: np.where(_)
Out[106]: (array([0, 1, 2, 3]), array([0, 0, 2, 5]))

The 2nd array gives the matches of arr in idx:

In [107]: _[1]
Out[107]: array([0, 0, 2, 5])

If you can guarantee exactly one match for each element, I don't think you need to do anything fancier.
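If that guarantee is in doubt, the broadcasting steps above could be wrapped in a small helper that checks it first. The following is a minimal sketch, not part of the original answer: the function name `labels_to_indices` and the choice to raise `ValueError` are assumptions made for illustration.

```python
import numpy as np

def labels_to_indices(labels, categories):
    """Return the position in `categories` of each entry of `labels`."""
    labels = np.asarray(labels)
    categories = np.asarray(categories)
    # Boolean table of shape (len(labels), len(categories)).
    matches = labels[:, None] == categories
    # Each row must contain exactly one True for the result to be meaningful.
    if not (matches.sum(axis=1) == 1).all():
        raise ValueError("every label must match exactly one category")
    # Column index of the single True in each row.
    return matches.argmax(axis=1)

icing_types = ["none", "l-mixed", "l-rime", "l-clear",
               "m-mixed", "m-rime", "m-clear"]
forecast_icing = ["none", "none", "l-rime", "m-rime"]

print(labels_to_indices(forecast_icing, icing_types))  # [0 0 2 5]
```

A label that is missing from `icing_types`, or a duplicated entry in `icing_types`, makes the helper raise instead of silently returning a wrong index.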

CodePudding user response:

I don't think there is, but I am not 100% sure.

Anyway, even if there were, you can get much better performance by using a mapping dict (icing_type -> idx):

icing_type_to_idx = {icing_type: idx for idx, icing_type in enumerate(icing_types)}
arr = np.array([icing_type_to_idx[ice] for ice in forecast_icing])
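To check the performance claim on your own machine, the approaches from this thread could be timed side by side. This is a rough sketch under assumptions: the function names and the enlarged input (the four forecast values repeated 250 times) are made up for the test, and no timings are quoted here because they depend on the input size and the NumPy version.

```python
import timeit

import numpy as np

icing_types = ["none", "l-mixed", "l-rime", "l-clear",
               "m-mixed", "m-rime", "m-clear"]
# Hypothetical larger input so the timings are measurable.
forecast_icing = ["none", "none", "l-rime", "m-rime"] * 250

idx = np.array(icing_types)
icing_type_to_idx = {icing_type: i for i, icing_type in enumerate(icing_types)}

def with_where():
    # The loop from the question: one np.where call per forecast value.
    return np.array([np.where(ice == idx) for ice in forecast_icing]).flatten()

def with_dict():
    # Plain dict lookup per forecast value.
    return np.array([icing_type_to_idx[ice] for ice in forecast_icing])

def with_broadcast():
    # Compare all forecast values against all icing types at once.
    return np.where(np.array(forecast_icing)[:, None] == idx)[1]

# All three approaches should agree before their speed is compared.
assert np.array_equal(with_where(), with_dict())
assert np.array_equal(with_dict(), with_broadcast())

for fn in (with_where, with_dict, with_broadcast):
    seconds = timeit.timeit(fn, number=10)
    print(f"{fn.__name__}: {seconds:.4f} s for 10 runs")
```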