The output of the code below is
fast: 0.018553733825683594
slow: 7.0305609703063965
and, on top of that, slow.dat is 10,252 KB while fast.dat is only 32 KB. Why is the fast one so fast and small while the slow one is so slow and big?
import shelve
import random
import time

start = time.time()
db = shelve.open('fast')
db["answers"] = []
answers = []
for i in range(1000):
    answer = {
        "foo0": random.randint(1, 10),
        "foo1": random.randint(1, 10),
        "foo2": random.randint(1, 10),
        "foo3": random.randint(1, 10),
        "foo4": random.randint(1, 10),
        "foo5": random.randint(1, 10)
    }
    answers.append(answer)
db['answers'] = answers
db.close()
print("fast:", time.time() - start)
start = time.time()
db = shelve.open('slow')  # slow and uses !!!!WAY MORE SPACE!!!!
db["answers"] = []
for i in range(1000):
    answer = {
        "foo0": random.randint(1, 10),
        "foo1": random.randint(1, 10),
        "foo2": random.randint(1, 10),
        "foo3": random.randint(1, 10),
        "foo4": random.randint(1, 10),
        "foo5": random.randint(1, 10)
    }
    db['answers'] = db['answers'] + [answer]
db.close()
print("slow:", time.time() - start)
CodePudding user response:
The docs say that shelve has trouble knowing whether a mutable structure has been modified. They suggest using writeback=True, which caches the structures in memory and writes them out on .sync() and .close().
This improved the required time and space only a little, but the OP said that also using .append() on these lists solves the problem.
If there are still problems, I would suggest using a database better suited to your situation than shelve.
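A minimal sketch of what writeback=True looks like in practice (the file path and the 'answers' key are from the question; the tempfile directory is just to keep the demo self-contained):

```python
import os
import shelve
import tempfile

# Hypothetical demo: with writeback=True, in-place mutations of cached
# values are tracked in memory and flushed only on .sync()/.close().
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo')

db = shelve.open(path, writeback=True)
db['answers'] = []
for i in range(1000):
    # No reassignment of db['answers'] needed each iteration:
    # the cached list is appended to in place.
    db['answers'].append({'value': i})
db.close()  # writes the cached 'answers' list to disk once

db = shelve.open(path)
print(len(db['answers']))
db.close()
```

Note that writeback=True keeps every accessed entry in memory until sync/close, so it trades memory for correctness and speed.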
CodePudding user response:
In the first example you are appending to a list; in the second you are creating a new list in each iteration of the loop.
mylist.append(obj)
-> adds to the existing list in place.
mylist = mylist + [obj]
-> concatenates two lists to make a new list, copying every existing element.
Creating a new list each time is more expensive than just appending.
mylist += [obj]
is probably what you want - this is the same as appending: same action, different syntax.
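The cost difference shows up even without shelve involved; a small sketch (the loop size n is arbitrary):

```python
import time

n = 20_000

# append: amortized O(1) per operation -> O(n) total
start = time.time()
lst = []
for i in range(n):
    lst.append(i)
append_time = time.time() - start

# concatenation: builds a brand-new list every iteration,
# copying all existing elements -> O(n^2) total
start = time.time()
lst = []
for i in range(n):
    lst = lst + [i]
concat_time = time.time() - start

print(f"append: {append_time:.4f}s, concat: {concat_time:.4f}s")
```

In the shelve case this is compounded: each `db['answers']` read unpickles the whole list and each assignment re-pickles it, and dbm backends typically do not reclaim the space of overwritten records, which is why slow.dat balloons in size.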