One-hot encoding of decimals with a fixed precision in python/numpy

Time:04-02

In Python, I would like to one-hot encode decimal numbers with a precision of 10^{-3}. Given input fractions such as interval = [0.433,0.223,0.111], it should produce a stack of one-hot vectors for each number in interval. For the first number, 0.433, we should get 4 ---> 0 0 0 0 1 0 0 0 0 0, 3 ---> 0 0 0 1 0 0 0 0 0 0, 3 ---> 0 0 0 1 0 0 0 0 0 0, and then concatenate the three to obtain a 30-dimensional one-hot array.

Also, if the input numbers are, say, [0.4,0.2,0.1], is there a way to apply the same technique, for example by treating them as the mathematically equivalent numbers [0.400,0.200,0.100]?

EDIT:

This is my attempt. I'm not sure it is the best way to get the result, and it does not handle the case where we are given, for example, [0.4, 0.2] but want to interpret it as [0.400, 0.200].

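import numpy as np

def encode_digits(floats: list):
    # Keep only the characters after the decimal point, e.g. 0.433 -> '433'
    decimals = [str(f).split('0.', 1)[1] for f in floats]
    one_hot = []
    # Loop over every input number (the original range(2) skipped the third one)
    for d in decimals:
        # Assumes each number is written with exactly three decimals
        for j in range(3):
            temp = np.zeros(10)
            temp[int(d[j])] = 1
            one_hot.append(temp)
    return np.array(one_hot).reshape(-1)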

Answer:

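As suggested in the comments, you can multiply by 1000 and convert to integers. You can then extract the individual digits of each integer in i and apply standard one-hot encoding. Because 0.4 * 1000 is 400, inputs such as [0.4, 0.2, 0.1] are automatically treated as [0.400, 0.200, 0.100]: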

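import numpy as np
n_digits = 3  # fixed precision of 10^-3
# np.rint avoids truncation errors; use n_digits, not len(i), for the digit positions
i = np.rint(np.array(interval) * 10 ** n_digits).astype(int)
digits = i // 10 ** np.arange(n_digits)[::-1, None] % 10
np.eye(10, dtype=int)[digits.T]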

output:

array([[[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]],

       [[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]],

       [[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]]])
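
The digit extraction is the least obvious step, so here is a minimal sketch that prints the intermediate array. The names n_digits and place_values are introduced only for illustration, and the rounding with np.rint is an added precaution rather than part of the snippet as originally posted.

```python
import numpy as np

interval = [0.433, 0.223, 0.111]
n_digits = 3  # assumed number of decimal places (illustrative name)

# Scale to integers; rounding guards against float representation error
i = np.rint(np.array(interval) * 10 ** n_digits).astype(int)  # 433, 223, 111

# Column vector of place values: [[100], [10], [1]]
place_values = 10 ** np.arange(n_digits)[::-1, None]

# Broadcasting shape (3,) against (3, 1) gives one row per digit position
digits = i // place_values % 10
print(digits)
# [[4 2 1]
#  [3 2 1]
#  [3 3 1]]

# Transposing gives one row per input number, ready for indexing np.eye(10)
print(digits.T)
# [[4 3 3]
#  [2 2 3]
#  [1 1 1]]
```

Each row of digits.T lists the decimal digits of one input number, which is why indexing np.eye(10) with it yields one one-hot vector per digit.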

If you prefer to concatenate the digits into one 30-dimensional vector per number, you could do:

np.eye(10, dtype=int)[digits.T].reshape(3, 3 * 10)

output:

array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 1, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
        0, 0, 0, 0, 0, 0, 0, 0]])
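
If you need this for an arbitrary number of inputs or a different precision, the steps above could be wrapped in a small helper. This is only a sketch: the function name one_hot_decimals and its n_digits parameter are hypothetical, it assumes every input lies in [0, 1) and does not round up to 1, and it rounds before casting to guard against float representation error.

```python
import numpy as np

def one_hot_decimals(values, n_digits=3):
    """Hypothetical helper: one-hot encode the first n_digits decimals.

    Assumes every value is in [0, 1) and does not round up to 1.
    Returns an int array of shape (len(values), 10 * n_digits).
    """
    # Scale to integers, rounding instead of truncating
    scaled = np.rint(np.asarray(values, dtype=float) * 10 ** n_digits).astype(int)
    # Place values from most to least significant, e.g. [100, 10, 1]
    place_values = 10 ** np.arange(n_digits)[::-1]
    # Shape (n, 1) against (n_digits,) gives digits of shape (n, n_digits)
    digits = scaled[:, None] // place_values % 10
    # Look up one-hot rows, then concatenate the digits of each number
    return np.eye(10, dtype=int)[digits].reshape(len(scaled), -1)

# Short inputs are padded implicitly: 0.4 is encoded as 0.400
encoded = one_hot_decimals([0.4, 0.2, 0.1])
print(encoded.shape)               # (3, 30)
print(np.flatnonzero(encoded[0]))  # positions 4, 10 and 20 are set
```

Because the scaling happens numerically, [0.4, 0.2, 0.1] and [0.400, 0.200, 0.100] produce the same encoding, which covers the second part of the question.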