In the code below I would like to find an alternative way of reinitializing lists.
iter_obj_main has over 200,000 iter_obj items, and a single iter_obj has over 1 million rows of data with over 50 columns (in the example below I use just 3 columns, a[], b[], c[], for demonstration). This makes the code look very long and ugly.
What I am actually looking for is a way to empty all 50 lists after every loop.
Any suggestions, Python gurus?
i = 0
for iter_obj in iter_obj_main:
    for x in iter_obj:
        i = 1
        a, b, c = ([] for j in range(3))
        if x == sometest:
            a.insert(i, x[0])
        else:
            a.insert(i, '')
        if x == sometest:
            b.insert(i, x[1])
        else:
            b.insert(i, '')
        if x == sometest:
            c.insert(i, x[2])
        else:
            c.insert(i, '')
    # moving data to database because of memory limitations and clearing lists.
CodePudding user response:
Assigning twice to the same variable is a waste of time. There is no use in doing a = [] when you follow it by another assignment like a = x[0].
Secondly, if x is always a list with 3 values, you could unpack it immediately:
for iter_obj in iter_obj_main:
    for a, b, c in iter_obj:
        # write to the database
Now, writing to the database will be the bottleneck anyway, so the above optimisation is not going to make the code run significantly faster. You should focus on how you can write bulk data to your database without sending instructions one by one.
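The question does not say which database is used, but as a minimal sketch of such a bulk write, here is what it could look like with Python's standard sqlite3 module. The table name rows, its columns, and the chunk size are made-up placeholders, and the sometest filtering from the question is left out:

import sqlite3

conn = sqlite3.connect("data.db")
conn.execute("CREATE TABLE IF NOT EXISTS rows (a TEXT, b TEXT, c TEXT)")

CHUNK = 10_000  # flush every 10k rows to keep memory use bounded

for iter_obj in iter_obj_main:
    batch = []
    for a, b, c in iter_obj:
        batch.append((a, b, c))
        if len(batch) >= CHUNK:
            # one round trip for the whole batch instead of one per row
            conn.executemany("INSERT INTO rows VALUES (?, ?, ?)", batch)
            conn.commit()
            batch.clear()
    if batch:  # flush whatever is left for this iter_obj
        conn.executemany("INSERT INTO rows VALUES (?, ?, ?)", batch)
        conn.commit()

conn.close()

Most other database drivers have an equivalent of executemany (or a dedicated bulk-load facility), which is usually far faster than issuing one insert per row.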
Finally, if your database table has 50 columns, it is very likely you have a design flaw in your database schema: you probably didn't normalise it.
CodePudding user response:
maybe use:
for iter_obj in iter_obj_main:
    for x in iter_obj:
        a, b, c, *_ = x
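The *_ part collects any columns beyond the first three into a throwaway list, so the same line keeps working even when a row has 50 columns. A tiny illustrative example with made-up values:

x = (1, 'foo', 3.5, 'extra1', 'extra2')   # pretend this row has extra columns
a, b, c, *_ = x
print(a, b, c)   # 1 foo 3.5
print(_)         # ['extra1', 'extra2'] -- the leftover columns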