In principle I'd like to do the opposite of what was done here https://datascience.stackexchange.com/questions/45916/loading-own-train-data-and-labels-in-dataloader-using-pytorch.
I have a PyTorch dataloader, train_dataloader, with shape (2000, 3). I want to store the 3 dataloader columns in 3 separate NumPy arrays. (The first column of the dataloader contains the data, the second column contains the labels.)
I managed to do it for the last batch of train_dataloader (see below), but unfortunately couldn't make it work for the whole train_dataloader.
for X, y, ind in train_dataloader:
    pass
# After the loop, X and y still hold only the last batch.
train_X = np.asarray(X, dtype=np.float32)
train_y = np.asarray(y, dtype=np.float32)
Any help would be very much appreciated!
CodePudding user response:
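You can collect every batch in a list inside the loop and concatenate the lists once the loop has finished:
import torch

all_X = []
all_y = []
for X, y, ind in train_dataloader:
    # Keep each batch instead of overwriting the previous one.
    all_X.append(X)
    all_y.append(y)
# Join the batches along the sample dimension, then convert to NumPy.
train_X = torch.cat(all_X, dim=0).numpy()
train_y = torch.cat(all_y, dim=0).numpy()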
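Since the question asks for all three columns, here is a minimal self-contained sketch that also collects the third one. The toy TensorDataset, its sizes, the batch size, and the names features, labels, indices and train_ind are assumptions made up for illustration; they stand in for the asker's real data, which is not shown. It also assumes the batches are CPU tensors that do not require gradients, otherwise .numpy() would need a preceding .detach().cpu().

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 10 samples with 4 features each,
# one label per sample, and the sample index as the third column.
features = torch.arange(40, dtype=torch.float32).reshape(10, 4)
labels = torch.arange(10, dtype=torch.float32)
indices = torch.arange(10)

train_dataloader = DataLoader(
    TensorDataset(features, labels, indices), batch_size=3, shuffle=True
)

all_X, all_y, all_ind = [], [], []
for X, y, ind in train_dataloader:
    # Store every batch; the last one may be smaller than batch_size.
    all_X.append(X)
    all_y.append(y)
    all_ind.append(ind)

# Concatenate along the sample dimension and convert to NumPy arrays.
train_X = torch.cat(all_X, dim=0).numpy()
train_y = torch.cat(all_y, dim=0).numpy()
train_ind = torch.cat(all_ind, dim=0).numpy()
```

With shuffle=True the rows come out in shuffled order, so train_ind is what lets you map each row of train_X and train_y back to its position in the original dataset.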