I must be missing something regarding how simultaneous requests are handled by PHP/Symfony, or perhaps how potentially simultaneous queries on the DB are handled...
This code seems to be doing the impossible - it randomly (about once a month) creates a duplicate of the new entity at the bottom. I conclude that it must happen when two clients make the same request twice, and both threads execute the SELECT query at the same time, picking up the entry where stop == NULL, and then they both (?) set the stoptime for that entry, and they both write a new entry.
Here's my logical outline as far as I understand things:
- Get all entries with NULL for stoptime
- Loop over those entries
- Only proceed if the day of the entry (UTC) is different from current day (UTC)
- Set stoptime for the open entry to 23:59:59 and flush to DB
- Build a new entry with the starttime 00:00:00 on the next day
- Assert that there are no other open entries in that position
- Assert that there are no future entries in that position
- Only then - flush the new entry to DB
Controller autocloseAndOpen
//if entry spans daybreak (midnight) close it and open a new entry at the beginning of next day
private function autocloseAndOpen($units) {
$now = new \DateTime("now", new \DateTimeZone("UTC"));
$repository = $this->em->getRepository('App\Entity\Poslog\Entry');
$query = $repository->createQueryBuilder('e')
->where('e.stop is NULL')
->getQuery();
$results = $query->getResult();
if (!isset($results[0])) {
return null; //there are no open entries at all
}
$em = $this->em;
$messages = "";
foreach ($results as $r) {
if ($r->getPosition()->getACRGroup() == $unit) { //only touch the user's own entries
$start = $r->getStart();
//Assert entry spanning datebreak
$startStr = $start->format("Y-m-d"); //Necessary for comparison, if $start->format("Y-m-d") is put in the comparison clause PHP will still compare the datetime object being formatted, not the output of the formatting.
$nowStr = $now->format("Y-m-d"); //Necessary for comparison, if $start->format("Y-m-d") is put in the comparison clause PHP will still compare the datetime object being formatted, not the output of the formatting.
if ($startStr < $nowStr) {
$stop = new \DateTimeImmutable($start->format("Y-m-d")."23:59:59", new \DateTimeZone("UTC"));
$r->setStop($stop);
$em->flush();
$txt = $unit->getName() . " had an entry in position (" . $r->getPosition()->getName() . ") spanning datebreak (UTC). Automatically closed at " . $stop->format("Y-m-d H:i:s") . "z.";
$messages .= "<p>" . $txt . "</p>";
//Open new entry
$newStartTime = $stop->modify(' 1 second');
$entry = new Entry();
$entry->setStart( $newStartTime );
$entry->setOperator( $r->getOperator() );
$entry->setPosition( $r->getPosition() );
$entry->setStudent( $r->getStudent() );
$em->persist($entry);
//Assert that there are no future entries before autoopening a new entry
$futureE = $this->checkFutureEntries($r->getPosition(),true);
$openE = $this->checkOpenEntries($r->getPosition(), true);
if ($futureE !== 0 || $openE !== 0) {
$txt = "Tried to open a new entry for " . $r->getOperator()->getSignature() . " in the same position (" . $r->getPosition()->getName() . ") next day but there are conflicting entries.";
$messages .= "<p>" . $txt . "</p>";
} else {
$em->flush(); //store to DB
$txt = "A new entry was opened for " . $r->getOperator()->getSignature() . " in the same position (" . $r->getPosition()->getName() . ")";
$messages .= "<p>" . $txt . "</p>";
}
}
}
}
return $messages;
}
I'm even running an extra check here with the checkOpenEntries() to see whether there is at this point any entries with stoptime == NULL in that position. Initially, I thought that to be superflous because I thought that if one request is running and operating on the DB, the other request will not start until the first is finished.
private function checkOpenEntries($position,$checkRelatives = false) {
$positionsToCheck = array();
if ($checkRelatives == true) {
$positionsToCheck = $position->getRelatedPositions();
$positionsToCheck[] = $position;
} else {
$positionsToCheck = array($position);
}
//Get all open entries for position
$repository = $this->em->getRepository('App\Entity\Poslog\Entry');
$query = $repository->createQueryBuilder('e')
->where('e.stop is NULL and e.position IN (:positions)')
->setParameter('positions', $positionsToCheck)
->getQuery();
$results = $query->getResult();
if(!isset($results[0])) {
return 0; //tells caller that there are no open entries
} else {
if (count($results) === 1) {
return $results[0]; //if exactly one open entry, return that object to caller
} else {
$body = 'Found more than 1 open log entry for position ' . $position->getName() . ' in ' . $position->getACRGroup()->getName() . ' this should not be possible, there appears to be corrupt data in the database.';
$this->email($body);
$output['success'] = false;
$output['message'] = $body . ' An automatic email has been sent to ' . $this->globalParameters->get('poslog-email-to') . ' to notify of the problem, manual inspection is required.';
$output['logdata'] = null;
return $this->prepareResponse($output);
}
}
}
Do I need to start this function with some kind of "Lock database" method to achieve what I am trying to do?
I have tested all functions and when I simulate all kinds of states (entry with NULL for stoptime even when it shouldn't be able to be etc.) it all works out. And the majority of time it all works nicely, but that one day somewhere in the middle of the month, this thing happens...
CodePudding user response:
You can never guarantee sequential order (or implicit exclusive access). Try that and you will dig yourself deeper and deeper.
As Matt and KIKO mentioned in the comments, you could use constraints and transactions and those should help immensely as your database will stay clean, but keep in mind your app needs to be able to catch errors produced by the database layer. Definitely worth trying first.
Another way of handling this is to enforce database/application level locking.
Database-level locking is more coarse and very unforgiving if you somewhere forget to release a lock (in long-running scripts).
MySQL docs:
If the connection for a client session terminates, whether normally or abnormally, the server implicitly releases all table locks held by the session (transactional and nontransactional). If the client reconnects, the locks are no longer in effect.
Locking the entire table is usually a bad idea altogether, but it is doable. This highly depends on the application.
Some ORMs, out-of-box, support object versioning and throw exceptions if the version has changed during the execution. In theory, your application would get an exception and upon retrying it would find that someone else already populated the field and that is no longer a candidate for an update.
Application-level locking is more fine-grained, but all points in the code need to honor the lock, otherwise, you are back to square #1. And if your app is distributed (say K8S, or just deployed across multiple servers), your locking mechanism has to be distributed as well (not local to an instance)