How to make one-hot data compatible with non one-hot?

I'm building a machine learning model to estimate a game's win rate for different character combinations. I get an error on the last line, where the loss function is called. I think it's because the input is a one-hot vector, so the model's output isn't compatible with the target data, which is just a boolean value: win or lose. Please advise me on how to get past this problem. How can I make a one-hot input compatible with a non-one-hot target?

'''For example, when the number of characters is 4 and each team has 2 members,
   x_data is [ [[0,0,1,0], [0,1,0,0], [1,0,0,0,],[0,1,0,0]],  [game2]...]
                team A1,    team A2,    team B1,    team B2
'''


y_data = [[0], [0], [0], [1], [1], [1]] # team blue win: 1, lose : 0
x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)

class BinaryClassifier(nn.Module):
    def __init__(self):
        super(BinaryClassifier, self).__init__()

        self.layer1 = nn.Sequential(
            nn.Linear(in_features=num_characters, out_features=10, bias=True),
            nn.ReLU(), 
            )

        self.layer2 = nn.Sequential(
            nn.Linear(in_features=10, out_features=1, bias=True),
            nn.Sigmoid(), 
            )
    
    def forward(self, x):
        x = self.layer1(x) 
        x = self.layer2(x)
        return torch.sigmoid(x)

model = BinaryClassifier()
optimizer = optim.SGD(model.parameters(), lr=1)

nb_epochs = 1000
for epoch in range(nb_epochs + 1):

    hypothesis = model(x_train)
    cost = nn.BCELoss(hypothesis, y_train)

 # RuntimeError: bool value of Tensor with more than one value is ambiguous

CodePudding user response：

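First, your issue is not about one-hot encoding: the model outputs a single number per sample and y_data holds 0/1 labels, so the two are compatible. The problem is how the loss is created. nn.BCELoss is a class, so nn.BCELoss(hypothesis, y_train) passes the tensors to its constructor instead of computing a loss, which is what raises the RuntimeError. Instantiate the loss first, then call the instance with the prediction and the target: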

...
model = BinaryClassifier()
optimizer = torch.optim.SGD(model.parameters(), lr=1)
loss = nn.BCELoss()

nb_epochs = 1000
for epoch in range(nb_epochs + 1):

    hypothesis = model(x_train)
    cost = loss(hypothesis, y_train)
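The loop above only computes the cost. A minimal sketch of a full training step might look like the following. The toy data, the layer sizes, the name loss_fn, and lr=0.1 are assumptions made for illustration, not values from the question. Note that the question's model applies a sigmoid twice (once inside layer2 and again in forward); this sketch applies it once, since nn.BCELoss expects probabilities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical toy data: 6 games, 4 players per game, 4 possible characters.
# Character indices are random placeholders, one-hot encoded and flattened.
indices = torch.randint(0, 4, (6, 4))                                   # (6, 4)
x_train = F.one_hot(indices, num_classes=4).float().flatten(start_dim=1)  # (6, 16)
y_train = torch.tensor([[0.], [0.], [0.], [1.], [1.], [1.]])            # (6, 1)

model = nn.Sequential(
    nn.Linear(16, 10),
    nn.ReLU(),
    nn.Linear(10, 1),
    nn.Sigmoid(),  # applied once; BCELoss expects values in [0, 1]
)
optimizer = optim.SGD(model.parameters(), lr=0.1)  # lr is an assumption
loss_fn = nn.BCELoss()  # instantiate the loss module first

losses = []
for epoch in range(100):
    hypothesis = model(x_train)          # shape (6, 1), same as y_train
    cost = loss_fn(hypothesis, y_train)  # then call the instance on the tensors

    optimizer.zero_grad()  # clear gradients from the previous step
    cost.backward()        # backpropagate
    optimizer.step()       # update the parameters
    losses.append(cost.item())
```

The prediction and the target must have the same shape, here (6, 1), which is why y_data is written as a list of single-element lists.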

As for your x_data, if each sample is a single one-hot vector, like:

[[0,0,1,0], [0,1,0,0], [1,0,0,0,], [0,1,0,0],...]

then in self.layer1 you should set in_features=4.

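If instead each sample is a list of one-hot vectors, like:

[ [[0,0,1,0], [0,1,0,0], [1,0,0,0,], [0,1,0,0]], [[0,0,1,0], [0,1,0,0], [1,0,0,0,], [0,1,0,0]], ...]

and you want to use a Linear layer, you have to flatten each sample, because a linear layer only transforms the last dimension of its input. Without flattening you would get one output per player rather than one per game.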

For example, the data above would become:

[[0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0], [0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0], ...]

and in_features=16.
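As a rough sketch, the flattening can be done on the tensor itself rather than by hand. The example below uses the two sample games shown above; the names x_flat and layer are only illustrative.

```python
import torch
import torch.nn as nn

# Two games from the example above: 4 players x 4 characters each.
x_data = [
    [[0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]],
]
x_train = torch.FloatTensor(x_data)           # shape (2, 4, 4)

# Keep the batch dimension and merge the rest into one feature vector.
x_flat = torch.flatten(x_train, start_dim=1)  # shape (2, 16)

# in_features is read from the flattened data instead of being hard-coded.
layer = nn.Linear(in_features=x_flat.shape[1], out_features=1)
out = layer(x_flat)                           # shape (2, 1): one value per game
```

Reading in_features from x_flat.shape[1] keeps the layer consistent if the number of characters or the team size changes.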

For your information, you can use a CNN (convolutional neural network) for inputs with two or more dimensions, and an RNN (recurrent neural network) for sequential inputs.

Hope it can be helpful.