I joined a project with a high frequency of changes; a lot of development had already been done and it looks like a monolithic architecture. Sometimes we use fields from an ancestor model in a child model to handle certain filtering and sorting features (I asked about this before, and the only solution suggested to me was duplicating the fields: LINK). Therefore we need to update these duplicated fields whenever the parent is updated. Since the project is large, instead of finding every place in the source code that writes the parent and updating it to also update the new model, I decided to use model signals. I implemented this with the _post_put_hook and _pre_put_hook hooks of the Google NDB model class.
class ParentOne(ndb.Model):
    name_parent_one = ndb.StringProperty()
    ....

    def _post_put_hook(self, future):
        obj = JoinParentNameClasses.query(JoinParentNameClasses.parent_one == self.key).get()
        obj.name_parent_one = self.name_parent_one
        obj.put()

class ParentTwo(ndb.Model):
    name_parent_two = ndb.StringProperty()
    ....

    def _post_put_hook(self, future):
        obj = JoinParentNameClasses.query(JoinParentNameClasses.parent_two == self.key).get()
        obj.name_parent_two = self.name_parent_two
        obj.put()

class JoinParentNameClasses(ndb.Model):
    parent_one = ndb.KeyProperty(kind='ParentOne')
    name_parent_one = ndb.StringProperty()
    parent_two = ndb.KeyProperty(kind='ParentTwo')
    name_parent_two = ndb.StringProperty()
    ... some other fields which are used for the API ....
But now a bigger problem has emerged: when someone uses put_multi or put_multi_async from Google NDB, NDB creates many future objects and sends them for processing with ndb.tasklet, then retrieves the final result with .get_result(). Judging by the ndb.tasklet implementation, it behaves like Python coroutines. When there are many rows to update, the _post_put_hook that updates the children creates a lot of call depth, and a "maximum recursion depth exceeded" error appears. How can I solve this problem?
Note: I know about sys.setrecursionlimit(1000), but that isn't a good solution; I'm looking for a best practice.
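For context, the failure mode can be reproduced in miniature without NDB. This is a toy sketch, not NDB's actual tasklet code: the point is that each hook that synchronously issues the next write adds a stack frame, so a large batch exhausts the stack.

```python
import sys

def chained_put(n):
    # Stand-in for a _post_put_hook that synchronously issues another
    # put from inside the previous one: every write adds a stack frame.
    if n == 0:
        return "done"
    return chained_put(n - 1)

try:
    chained_put(sys.getrecursionlimit() * 2)  # deeper than the stack allows
except RecursionError:
    print("maximum recursion depth exceeded")
```

A small chain completes fine; it is only the length of the synchronous chain, i.e. the batch size, that turns this into a crash.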
CodePudding user response:
Using _post_put_hook seems to be asking for trouble. Serial puts are a bad idea because they are slow and increase Datastore contention, and you might have created an infinite loop as well. Instead, you want to batch your puts together:
class ParentOne(ndb.Model):
    def put_me(self):
        obj = JoinParentNameClasses.query(JoinParentNameClasses.parent_one == self.key).get()
        obj.name_parent_one = self.name_parent_one
        ndb.put_multi([self, obj])
You mention that you are already using ndb.put_multi, which is what triggers all the _post_put_hook calls. Instead, write a custom function that:
- receives a list of objects
- gets all the relevant join objects
- puts them all with a single ndb.put_multi
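The steps above can be sketched as follows. This is a minimal illustration, not the answerer's exact code: to keep it runnable without a Datastore emulator, the per-entity join lookup and the batched write are injected as plain callables; in real code they would be the JoinParentNameClasses.query(...).get() call and ndb.put_multi.

```python
def batch_put_with_join(entities, fetch_join_for, put_multi):
    """Fetch each entity's join row, copy the denormalized name fields
    onto it, then issue one batched write so no per-entity hook fires."""
    join_rows = []
    for entity in entities:
        join = fetch_join_for(entity)  # stand-in for the NDB query .get()
        if join is None:
            continue
        # Copy whichever denormalized fields this entity carries.
        for field in ("name_parent_one", "name_parent_two"):
            value = getattr(entity, field, None)
            if value is not None:
                setattr(join, field, value)
        join_rows.append(join)
    # Single batched write for parents and join rows together.
    put_multi(list(entities) + join_rows)
    return join_rows
```

With real NDB, fetch_join_for would run one query per entity; those lookups could themselves be issued asynchronously if latency matters, but the key point is that the writes happen in one put_multi, with no hooks chaining further puts.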