I would like to speed up the code below.
x = []
y = []
z = []
for i in range(0, 1000000):
    if 0 < u[i] < 1920 and 0 < v[i] < 1080:
        x.append(u[i])
        y.append(v[i])
        z.append([x_ind[i], y_ind[i]])
Any ideas would be really appreciated. Thanks
CodePudding user response:
Typically, you can optimize cases like this by replacing loops over indices (with repeated indexing) with loops over the values themselves, via zip. The replacement code here would be:
x = []
y = []
z = []
for a, b, c, d in zip(u, v, x_ind, y_ind):
    if 0 < a < 1920 and 0 < b < 1080:
        x.append(a)
        y.append(b)
        z.append([c, d])
If u, v, x_ind and y_ind might be longer than 1000000 items and you must stop after 1000000 items checked, just add an import to the top of the file, from itertools import islice, and change the for loop itself to:
for a, b, c, d in islice(zip(u, v, x_ind, y_ind), 1000000):
Either way, you remove all indexing from the code (indexing has one of the worst ratios of overhead to useful work accomplished in the CPython reference interpreter, though other interpreters and tools like Cython will behave differently) and, if you use nicer names than a, b, c and d, you get more self-documenting code.
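For instance, here is a sketch with more descriptive names (px/py for the pixel coordinates and ix/iy for the indices are purely illustrative choices; the logic is identical to the loop above):

x = []
y = []
z = []
for px, py, ix, iy in zip(u, v, x_ind, y_ind):
    if 0 < px < 1920 and 0 < py < 1080:  # keep points strictly inside the 1920x1080 frame
        x.append(px)
        y.append(py)
        z.append([ix, iy])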
There are minor benefits (decreasing in the most recent versions of Python) to pre-binding copies of append instead of looking up the bound method dynamically on each iteration, so if you're really hurting for speed, especially on older versions of Python that didn't optimize away the creation of bound methods, you can try:
x = []
y = []
z = []
xapp, yapp, zapp = x.append, y.append, z.append
for a, b, c, d in zip(u, v, x_ind, y_ind):
    if 0 < a < 1920 and 0 < b < 1080:
        xapp(a)
        yapp(b)
        zapp([c, d])
(adding islice if needed) to reduce method call overhead a bit at the expense of uglier code. Definitely don't do this unless profiling has shown this is the hot code path and you really need it faster.
Lastly, a note: if this code is being run at top level (outside of any function) it will run significantly slower. Variable lookup for locally scoped names in a function is a C array lookup; looking up globally scoped names, which all lookups outside a function involve, requires at least one dict key lookup, which is substantially more expensive. Put the code in a function (along with the definitions of x, y and z; u, v, x_ind and y_ind don't matter so much if you're zipping rather than indexing them) and call that function instead of running at global scope, and it should run a lot faster.
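A minimal sketch of that refactoring (the name filter_points is purely illustrative):

def filter_points(u, v, x_ind, y_ind):
    x = []
    y = []
    z = []
    for a, b, c, d in zip(u, v, x_ind, y_ind):
        if 0 < a < 1920 and 0 < b < 1080:
            x.append(a)
            y.append(b)
            z.append([c, d])
    return x, y, z

x, y, z = filter_points(u, v, x_ind, y_ind)

Inside the function, x, y, z and the loop variables are all locals, so every name the hot loop touches gets the fast lookup path.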
Improvements beyond this might be possible using numpy arrays instead of lists, but you'd need to be much more specific about your problem to hazard a guess on the utility of such a change.
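For example, if u, v, x_ind and y_ind were already numpy arrays of equal length (an assumption; it likely wouldn't pay to convert plain lists just for this), a vectorized version might look like the following sketch, which produces arrays rather than lists:

import numpy as np

# boolean mask selecting points strictly inside the 1920x1080 frame
mask = (0 < u) & (u < 1920) & (0 < v) & (v < 1080)
x = u[mask]
y = v[mask]
z = np.column_stack((x_ind[mask], y_ind[mask]))  # one [x_ind, y_ind] pair per row

The whole filter then runs in C rather than in bytecode, which is usually a much bigger win than any of the micro-optimizations above.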