Deep Learning how to split 5 dimensions timeseries and pass some dimensions through embedding layer

Time:12-12

I have an input that is a time series of 5 dimensions:

a = [[8,3],[1,0,0] , [4,5],[0,0,1], ...] # 100 timestamps in total. At each timestamp, dims 0,1 are numerical data and dims 2,3,4 are a one-hot encoded category. This is one sample; there are 3200 samples

I want to build a NN such that the last 3 dimensions (the one-hot encoded categories) will go through an embedding layer with output size 8, and then will be concatenated back to the first two dims (the numerical data).

So, this will be something like:

input1 = keras.layers.Input(shape=(2,)) #the numerical features
input2 = keras.layers.Input(shape=(3,)) #the one-hot encoding of the categories. this part will be embedded to 8 dims
x2 = Embedding(input_dim=3, output_dim = 8)(input2) #apply it to every timestamp and take only dims 3-5, so [1,0,0],[0,0,1] 
x = concatenate([input1,x2]) #will get 10 dims at each timepoint, still 100 timepoints
x = LSTM(units=24)(x) #the input has 10 dims/features at each timepoint, total 100 timepoints per sample
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[input1, input2] , outputs=[x]) #input1 is 1D vec of the width 2 , input2 is 1D vec with the width 3 and it is going through the embedding
model.compile(
        loss='binary_crossentropy',
        optimizer='adam',
        metrics=['acc']
    )

How can I do this (preferably in Keras)? My problem is how to apply the embedding at every timepoint. That is, if I have 1000 timepoints with 3 dims each, I need to convert them to 1000 timepoints with 8 dims each (the embedding layer should transform input2 from (1000x3) to (1000x8)).

CodePudding user response:

There are a couple of issues here. Let me give you a working example and explain along the way how to solve them.

Imports and Data Generation

import tensorflow as tf
import numpy as np

from tensorflow.keras import layers
from tensorflow.keras.models import Model

num_timesteps = 100
num_features = 5
num_observations = 2

input_list = [[[np.random.randint(1, 100) for _ in range(num_timesteps)]
   for _ in range(num_features)]
    for _ in range(num_observations)]

input_arr = np.array(input_list)  # shape (2, 5, 100)

To use an embedding, we need to pass the vocabulary size (voc_size) as input_dim, as stated in the Embedding documentation.

Embedding and Concatenation

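# Embedding indices must lie in [0, input_dim), so use the largest index + 1
# (counting unique values would undercount if some integer never occurs).
voc_size = int(input_arr[:, 2:, :].max()) + 1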
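
To see why input_dim has to be larger than the largest index, it can help to think of an embedding as a row lookup in a weight table. The following is a minimal NumPy sketch of that idea, not Keras's actual implementation; the names codes and table and the random values are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical integer-coded data: 2 observations, 3 features, 100 timesteps.
codes = rng.integers(1, 100, size=(2, 3, 100))

# The table needs one row per possible index 0..max, hence max + 1 rows.
input_dim = int(codes.max()) + 1
output_dim = 8
table = rng.normal(size=(input_dim, output_dim))

# An embedding lookup is plain row indexing; it appends one axis of size output_dim.
embedded = table[codes]
print(embedded.shape)  # (2, 3, 100, 8)
```

An index equal to or larger than the number of rows would have no row to point at, which is why the table is sized from the maximum index rather than from the count of distinct values.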

Now we need to create the inputs. They should have shapes [None, 2, num_timesteps] and [None, 3, num_timesteps], where the first dimension is flexible and will be filled with the number of observations we pass in. Let's apply the embedding right after that, using the previously calculated voc_size.

inp1 = layers.Input(shape=(2, num_timesteps))  # TensorShape([None, 2, 100])
inp2 = layers.Input(shape=(3, num_timesteps))  # TensorShape([None, 3, 100])
x2 = layers.Embedding(input_dim=voc_size, output_dim=8)(inp2)  # TensorShape([None, 3, 100, 8])

This cannot be easily concatenated, since all dimensions must match except for the one along the concatenation axis, and here two dimensions do not match (dim 1 and dim 3). There is a solution, though: we create a list of 5 tensors, each of shape (None, 100, x), where x can vary among the tensors, so dim 2 becomes our concatenation axis.

tensorlist = [inp1[:, i, :, tf.newaxis] for i in range(2)] + [x2[:, i, :, :] for i in range(3)]

Because each slice of inp1 has one fewer dimension than the corresponding slice of x2, we first extend its dimensionality with tf.newaxis, which transforms the sliced tensor of shape (None, 100) to (None, 100, 1). Now we can concatenate without any issue, and everything works in a straightforward fashion:

x = layers.concatenate(tensorlist, axis=2)
x = layers.LSTM(128)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = Model(inputs=[inp1, inp2], outputs=[x])
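
The shape bookkeeping behind the slicing and concatenation can be checked with plain NumPy arrays standing in for the Keras tensors. This is only a sketch: the zero arrays are placeholders, and a batch size of 2 is assumed in place of None:

```python
import numpy as np

batch, num_timesteps = 2, 100
inp1 = np.zeros((batch, 2, num_timesteps))     # numerical features
x2 = np.zeros((batch, 3, num_timesteps, 8))    # embedded categorical features

# Selecting feature i drops axis 1; np.newaxis adds a trailing feature axis.
numeric_parts = [inp1[:, i, :, np.newaxis] for i in range(2)]   # each (2, 100, 1)
embedded_parts = [x2[:, i, :, :] for i in range(3)]             # each (2, 100, 8)

# All five pieces agree on (batch, timesteps), so they can be joined on axis 2.
merged = np.concatenate(numeric_parts + embedded_parts, axis=2)
print(merged.shape)  # (2, 100, 26): 2 * 1 + 3 * 8 features per timestep
```

So the LSTM sees 26 features at each of the 100 timesteps: one per numerical feature and eight per embedded feature.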

Check on Dummy Example

inp1_np = input_arr[:, :2, :]
inp2_np = input_arr[:, 2:, :]
model.predict([inp1_np, inp2_np])

# Output
# array([[0.544262 ],
#       [0.6157502]], dtype=float32)

# The outputs are values between 0 and 1, just as expected from the sigmoid.

CodePudding user response:

In case you are not looking for embeddings the way they are usually used in Keras (positive integers mapped to dense vectors), you might be looking for some sort of unprojection or basis expansion, in which 3 dimensions get mapped (embedded) to 8 and the result is concatenated with the other features. This can be done using the kernel trick or other methods, but it also happens implicitly in neural networks through non-linear layers.

As such, you can do something like this, following a format similar to pythonic833's answer because it was good (but with the timesteps in the middle, since the Keras LSTM documentation asks for [batch, timesteps, feature]):

Input generation

import tensorflow as tf
import numpy as np

from tensorflow.keras import layers
from tensorflow.keras.models import Model

num_timesteps = 100
num_features = 5
num_observations = 2

input_list = [[[np.random.randint(1, 100) for _ in range(num_features)]
   for _ in range(num_timesteps)]
    for _ in range(num_observations)]

input_arr = np.array(input_list)  # shape (2, 100, 5)

Model construction

Then you can process the inputs:

input1 = layers.Input(shape=(num_timesteps, 2,))
input2 = layers.Input(shape=(num_timesteps, 3))
x2 = layers.Dense(8, activation='relu')(input2)
x = layers.concatenate([input1,x2], axis=2) # This produces tensors of shape (None, 100, 10)
x = layers.LSTM(units=24)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = Model(inputs=[input1, input2] , outputs=[x])
model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['acc']
)

Results

inp1_np = input_arr[:, :, :2]
inp2_np = input_arr[:, :, 2:]
model.predict([inp1_np, inp2_np])

which produces

array([[0.44117224],
       [0.23611131]], dtype=float32)
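
For strictly one-hot input like the question's, a dense layer with no bias and no activation simply picks out one row of its weight matrix, which gives the same result as an embedding lookup on the category index. The NumPy sketch below illustrates this under assumed names (indices, one_hot, weights) and random weights; note that the Dense layer above additionally has a bias and a ReLU, so it is not numerically identical to this:

```python
import numpy as np

rng = np.random.default_rng(0)
batch, num_timesteps, num_categories, output_dim = 2, 100, 3, 8

# Hypothetical one-hot categorical input of shape (batch, timesteps, 3).
indices = rng.integers(0, num_categories, size=(batch, num_timesteps))
one_hot = np.eye(num_categories)[indices]

weights = rng.normal(size=(num_categories, output_dim))

# A linear map acts on the last axis only, so every timestep is mapped independently.
dense_out = one_hot @ weights                    # shape (2, 100, 8)

# Alternatively, recover the category index with argmax and look the row up.
lookup_out = weights[one_hot.argmax(axis=-1)]    # shape (2, 100, 8)

print(dense_out.shape, np.allclose(dense_out, lookup_out))
```

In Keras terms, the lookup variant corresponds to passing the argmax indices of shape (batch, timesteps) to an Embedding layer with input_dim=3 and output_dim=8, which returns a tensor of shape (batch, timesteps, 8).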

Other explanations about basis expansion to check out:

  1. https://stats.stackexchange.com/questions/527258/embedding-data-into-a-larger-dimension-space
  2. https://www.reddit.com/r/MachineLearning/comments/2ffejw/why_dont_researchers_use_the_kernel_method_in/