I would like to speed up the code below.
x = []
y = []
z = []
for i in range(0, 1000000):
    if 0 < u[i] < 1920 and 0 < v[i] < 1080:
        x.append(u[i])
        y.append(v[i])
        z.append([x_ind[i], y_ind[i]])
Any ideas would be really appreciated. Thanks
CodePudding user response:
Typically, you can optimize cases like this by replacing loops over indices (with repeated indexing) with loops over the values themselves, via zip. The replacement code here would be:
x = []
y = []
z = []
for a, b, c, d in zip(u, v, x_ind, y_ind):
    if 0 < a < 1920 and 0 < b < 1080:
        x.append(a)
        y.append(b)
        z.append([c, d])
If u, v, x_ind and y_ind might be longer than 1000000 items and you must stop after 1000000 items checked, just add an import to the top of the file, from itertools import islice, and change the for loop itself to:
for a, b, c, d in islice(zip(u, v, x_ind, y_ind), 1000000):
Either way, you remove all indexing from the code (indexing has one of the worst ratios of overhead to useful work accomplished in the CPython reference interpreter, though other interpreters and tools like Cython will behave differently) and, if you use nicer names than a, b, c and d, you get more self-documenting code.
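For instance, here is a sketch with more descriptive names (px/py for the pixel coordinates and ix/iy for the indices are purely illustrative choices; the logic is identical to the loop above):

x = []
y = []
z = []
for px, py, ix, iy in zip(u, v, x_ind, y_ind):
    if 0 < px < 1920 and 0 < py < 1080:  # keep points strictly inside the 1920x1080 frame
        x.append(px)
        y.append(py)
        z.append([ix, iy])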
There are minor benefits (decreasing in the most recent versions of Python) to pre-binding copies of append instead of looking up the bound method dynamically on each iteration, so if you're really hurting for speed, especially on older versions of Python that didn't optimize away the creation of bound methods, you can try:
x = []
y = []
z = []
xapp, yapp, zapp = x.append, y.append, z.append
for a, b, c, d in zip(u, v, x_ind, y_ind):
    if 0 < a < 1920 and 0 < b < 1080:
        xapp(a)
        yapp(b)
        zapp([c, d])
(adding islice if needed) to reduce method call overhead a bit at the expense of uglier code. Definitely don't do this unless profiling has shown this is the hot code path and you really need it faster.
Lastly, a note: if this code is being run at top level (outside of any function) it will run significantly slower. Variable lookup for locally scoped names in a function is a C array lookup; looking up globally scoped names, which all lookups outside a function involve, requires at least one dict key lookup, which is substantially more expensive. Put the code in a function (along with the definitions of x, y and z; u, v, x_ind and y_ind don't matter so much if you're zipping rather than indexing them) and call that function instead of running at global scope, and it should run a lot faster.
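A minimal sketch of that refactoring (the name filter_points is purely illustrative):

def filter_points(u, v, x_ind, y_ind):
    x = []
    y = []
    z = []
    for a, b, c, d in zip(u, v, x_ind, y_ind):
        if 0 < a < 1920 and 0 < b < 1080:
            x.append(a)
            y.append(b)
            z.append([c, d])
    return x, y, z

x, y, z = filter_points(u, v, x_ind, y_ind)

Inside the function, x, y, z and the loop variables are all locals, so every name the hot loop touches gets the fast lookup path.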
Improvements beyond this might be possible using numpy arrays instead of lists, but you'd need to be much more specific about your problem to hazard a guess on the utility of such a change.
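For example, if u, v, x_ind and y_ind were already numpy arrays of equal length (an assumption; it likely wouldn't pay to convert plain lists just for this), a vectorized version might look like the following sketch, which produces arrays rather than lists:

import numpy as np

# boolean mask selecting points strictly inside the 1920x1080 frame
mask = (0 < u) & (u < 1920) & (0 < v) & (v < 1080)
x = u[mask]
y = v[mask]
z = np.column_stack((x_ind[mask], y_ind[mask]))  # one [x_ind, y_ind] pair per row

The whole filter then runs in C rather than in bytecode, which is usually a much bigger win than any of the micro-optimizations above.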