I am trying to understand why the following code crashes my Colab session.
import numpy as np
import tensorflow as tf
x1 = np.random.rand(90000)
x2 = tf.random.uniform((90000,1)).numpy()
print(x1.shape, type(x1))
print(x2.shape, type(x2))
x1 - x2
I can see that memory is exploding, which causes the crash, but I was hoping someone could explain exactly why this is happening. I also understand that this has to do with NumPy's array broadcasting,
and I am wondering whether this is expected behavior so I can avoid it in the future.
The fix is to use np.squeeze(x2, axis=1)
so the vectors have the same shape, but clearly there's something I don't understand about what NumPy
is doing under the hood. Any suggestions and clarifications are welcome.
CodePudding user response:
x1 has shape (90000,). x2 has shape (90000, 1). In the expression x1 - x2, broadcasting occurs (as you suspected), giving a result that has shape (90000, 90000). Such an array of 8-byte floating point values requires 90000 * 90000 * 8 = 64,800,000,000 bytes (roughly 60 GiB), which exceeds the memory available in a Colab session.
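A small-scale sketch (using length-5 arrays instead of 90000 so the result fits in memory) shows the same broadcasting behavior and the squeeze fix:

```python
import numpy as np

x1 = np.random.rand(5)       # shape (5,)
x2 = np.random.rand(5, 1)    # shape (5, 1)

# Broadcasting right-aligns the shapes: (5,) is treated as (1, 5),
# and (1, 5) combined with (5, 1) broadcasts to (5, 5).
diff = x1 - x2
print(diff.shape)            # (5, 5)

# Removing the size-1 axis makes both operands shape (5,),
# so the subtraction is elementwise.
diff_fixed = x1 - np.squeeze(x2, axis=1)
print(diff_fixed.shape)      # (5,)
```

With 90000 in place of 5, that (5, 5) result becomes a (90000, 90000) array, which is what exhausts the session's memory.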