I am using the code below as a template for writing a data generator function in TensorFlow and I was wondering if the trailing comma below is necessary or useful:
def __data_generation(self, list_IDs_temp):
    'Generates data containing batch_size samples' # X : (n_samples, *dim, n_channels)
    # Initialization
    X = np.empty((self.batch_size, *self.dim, self.n_channels))
    y = np.empty((self.batch_size), dtype=int)

    # Generate data
    for i, ID in enumerate(list_IDs_temp):
        # Store sample
        X[i,] = np.load('data/' + ID + '.npy')

        # Store class
        y[i] = self.labels[ID]

    return X, keras.utils.to_categorical(y, num_classes=self.n_classes)
Does the comma in X[i,] do anything? I've searched high and low and run a bunch of tests with similar code in Jupyter and I can't find any difference between using or not using the comma.
CodePudding user response:
There is not much difference: the trailing comma is simply redundant here. Do note, though, that the comma may make the indexing marginally slower, as the timings below suggest (the gap is small and close to the run-to-run noise):
>>> from timeit import timeit
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.26200279999999765
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.27410390000000007
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.3642131000000006
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.3105785999999995
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2766163000000006
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2650689999999969
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2776439999999951
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.3056855999999968
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2718677000000014
>>> timeit('a[:3,]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2666911999999968
>>> from timeit import timeit
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.25228500000000054
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.23471499999999423
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.3306362000000007
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2560698000000059
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2566029000000043
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.24175780000000202
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.23682909999999424
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2400262999999967
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.2468849999999918
>>> timeit('a[:3]', 'import numpy as np; a = np.array([1, 2, 3, 4, 5])')
0.22863809999999773
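Since the individual runs above vary quite a bit, a steadier comparison is to repeat each measurement and keep the minimum. This is only a sketch, assuming the same five-element array; the repeat and loop counts are arbitrary choices, and the absolute numbers will differ between machines:

```python
from timeit import repeat

# Same setup string as in the runs above.
setup = "import numpy as np; a = np.array([1, 2, 3, 4, 5])"

# Take the best of several repeats to reduce noise from other processes.
with_comma = min(repeat("a[:3,]", setup, repeat=5, number=100_000))
without_comma = min(repeat("a[:3]", setup, repeat=5, number=100_000))

print(f"a[:3,]: {with_comma:.4f} s   a[:3]: {without_comma:.4f} s")
```

Whatever the outcome, any difference is a per-call cost of a few nanoseconds, so it is unlikely to matter next to something like np.load.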
For more information (though I'm sure you already know this), adding a trailing comma after a single object in parentheses does make a difference: with it, a tuple is created; without it, the parentheses are simply ignored:
a = (1)
print(a)
a = (1,)
print(a)
Output:
1
(1,)
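The same rule applies inside square brackets: a trailing comma makes Python pass a 1-tuple to `__getitem__` instead of the bare index. A minimal sketch of this, where `ShowIndex` is a hypothetical helper class written only for illustration:

```python
import numpy as np


class ShowIndex:
    """Hypothetical helper: returns whatever object the subscript receives."""

    def __getitem__(self, index):
        return index


probe = ShowIndex()
plain_index = probe[1]    # receives the int 1
tuple_index = probe[1,]   # receives the 1-tuple (1,)

# NumPy treats the 1-tuple index like the bare integer, so both select row 1.
a = np.arange(12).reshape(3, 4)
same_row = np.array_equal(a[1], a[1,])

print(repr(plain_index), repr(tuple_index), same_row)
```

So with a NumPy array, X[i,] and X[i] end up selecting the same thing; the comma only changes the form of the index object that NumPy receives.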
CodePudding user response:
Timings for non-trivial assignments:
In [146]: x=np.zeros((100,100,100))
In [148]: y=np.arange(10000.).reshape(100,100)
In [149]: x[1]=y
In [150]: timeit x[1]=y
6.89 µs ± 64.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [151]: timeit x[1,]=y
6.89 µs ± 70.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [153]: timeit x[1,:,:]=y
7.12 µs ± 3.44 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Times for a trivial task:
In [154]: timeit x[1]
194 ns ± 3.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [155]: timeit x[1,]
205 ns ± 10.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Sometimes in my answers I include trailing colons, x[1,:,:]. The code doesn't care, but it can help human readers see that the array has several dimensions and that only the first one is being indexed.
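To check that the three spellings really are interchangeable, here is a small sketch using smaller arrays than in the timings above (the shapes are arbitrary assumptions chosen to keep the example quick):

```python
import numpy as np

y = np.arange(12.0).reshape(3, 4)

# Three separate targets, one per indexing style.
x_plain = np.zeros((2, 3, 4))
x_comma = np.zeros((2, 3, 4))
x_colons = np.zeros((2, 3, 4))

x_plain[1] = y
x_comma[1,] = y
x_colons[1, :, :] = y

# All three assignments fill the same (3, 4) block.
all_equal = np.array_equal(x_plain, x_comma) and np.array_equal(x_plain, x_colons)

# Reading works the same way: each form is a view onto the same memory.
views_shared = np.shares_memory(x_plain[1], x_plain[1, :, :])

print(all_equal, views_shared, x_plain[1,].shape)
```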