PyTorch: Checking Model Accuracy Results in "TypeError: 'bool' object is not iterable"

I am training a neural network and would like to check its accuracy. I've used Librosa and scikit-learn to represent audio as 1D NumPy arrays. Thus x_train, x_test, y_train, and y_test are all 1D NumPy arrays, with the x_* arrays containing floats and the y_* arrays containing strings corresponding to classes of data. For example:

x_train = [0.235, 1.101, 3.497]
y_train = ['happy', 'angry', 'neutral']

I've written a dictionary of these classes; each class's position in the resulting list serves as its integer label:

emotions = {
    '01': 'neutral',
    '02': 'calm',
    '03': 'happy',
    '04': 'sad',
    '05': 'angry',
    '06': 'fearful',
    '07': 'disgust',
    '08': 'surprised',
}

emotion_list = list(emotions.values())

Next I've defined a class to transform this data such that it can be passed to torch.utils.data.DataLoader():

import torch
from torch.utils.data import Dataset, DataLoader

class MakeDataset(Dataset):
    def __init__(self, x_train, y_train):
        self.x_train = torch.FloatTensor(x_train)
        # CrossEntropyLoss expects integer (long) class indices
        self.y_train = torch.LongTensor([emotion_list.index(each) for each in y_train])
    def __len__(self):
        return self.x_train.shape[0]
    def __getitem__(self, ind):
        # index the stored tensors rather than the global y_train
        x = self.x_train[ind]
        y = self.y_train[ind]
        return x, y

I define a training set, testing set, batch size, and load the data:

train_set = MakeDataset(x_train, y_train)
test_set = MakeDataset(x_test, y_test)

batch_size = 512

train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)

I define the model, train, and test as follows:

class TwoLayerMLP(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerMLP, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred


model = TwoLayerMLP(180, 90, 8)
optimizer = torch.optim.Adam(model.parameters())
criterion = torch.nn.CrossEntropyLoss()

epochs = 5000

total_train = 0
correct_train = 0
for epoch in range(epochs):
    model.train()
    running_loss = 0.0
    for batch_num, data in enumerate(train_loader):
        audio , label = data
        optimizer.zero_grad()
        outputs = model(audio.float())
        loss = criterion(outputs, label)
        loss.backward()
        optimizer.step()
        
        predicted = torch.max(outputs.data,1)
        total_train += float(label.size(0))
        
        # Code runs with line below commented 
        # Else returns "TypeError: 'bool' object not iterable."
        correct_train += sum(predicted == label)

Note that this code has been updated; formerly the problematic line was:

correct_train += float((predicted == label)).sum()

Can anyone explain why this boolean object cannot be iterated as expected?

CodePudding user response：

You don't need to convert to float before summing; you can use:

(predicted == label).sum().item()

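When predicted and label are both tensors, (predicted == label) returns a BoolTensor, which can be summed directly: .sum() counts the True entries as an integer tensor, and .item() converts that to a plain Python number.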
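
A minimal sketch of that counting step, assuming predicted and label are 1D tensors of class indices with the same length (the values below are made up for illustration):

```python
import torch

# Hypothetical predictions and labels for a batch of five samples
predicted = torch.tensor([0, 2, 1, 4, 4])
label = torch.tensor([0, 2, 3, 4, 1])

matches = predicted == label          # element-wise comparison, dtype torch.bool
n_correct = matches.sum().item()      # counts the True entries; .item() gives a Python int
accuracy = n_correct / label.size(0)  # fraction of correct predictions in the batch

print(matches.dtype, n_correct, accuracy)
```

Dividing by label.size(0) gives per-batch accuracy; for a running total, accumulate n_correct and the batch sizes separately and divide at the end.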

PS: It is odd that float((predicted == label)) did not throw an error for you. On my machine, with PyTorch 1.9.1, running that expression on a tensor containing more than one element raises an error saying that conversion to float only works for tensors with exactly one element.

e.g.

tx = torch.ones(5)
ty = torch.ones(5)
c = float((tx == ty)).sum()

throws the error

----> 1 float((tx == ty))

ValueError: only one element tensors can be converted to Python scalars

Also, there are a number of bugs in the code you pasted as a reproduction; I would double-check that the reproduction code is actually runnable.
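
One of those bugs looks like the probable source of the error itself: torch.max(outputs, 1) returns a (values, indices) pair rather than a single tensor, so predicted in the question is a tuple. Comparing that tuple with label appears to yield a plain Python bool instead of a BoolTensor, which would explain the message. A hedged sketch of the bookkeeping with the pair unpacked, using random stand-in data (the shapes and variable names below are assumptions for illustration, not the asker's data):

```python
import torch

torch.manual_seed(0)

# Stand-in for one batch: 6 samples, 8 classes (random logits and labels)
outputs = torch.randn(6, 8)
label = torch.randint(0, 8, (6,))

result = torch.max(outputs, 1)   # a (values, indices) pair, not a single tensor
values, predicted = result       # the indices are the predicted class per sample

total = label.size(0)
correct = (predicted == label).sum().item()
print(f"{correct}/{total} correct")
```

If the maximum values themselves are not needed, torch.argmax(outputs, dim=1) returns the indices directly.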

CodePudding user response：

Replace

correct_train += float((predicted == label)).sum()

with

correct_train += sum(predicted == label)

You don't need to convert the boolean tensor to float; the sum function treats False as 0 and True as 1.
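
To illustrate, here is a small sketch of the built-in sum on a boolean tensor, next to the case that produces the error message from the question. The built-in iterates over its argument, so it works on a 1D tensor but fails on a single Python bool (the example values are made up):

```python
import torch

matches = torch.tensor([True, False, True, True])

count = sum(matches)   # iterates over the elements; the result is a 0-dim tensor
print(count, int(count))

try:
    sum(False)         # a plain bool is not iterable
except TypeError as err:
    message = str(err)
    print(message)
```

So if the error appears, the argument passed to sum is probably a single bool rather than a tensor, and the comparison that produced it is the place to look.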