Home > Software design >  Is amortized time complexity analysis broken for immutable colections?
Is amortized time complexity analysis broken for immutable colections?

Time:09-21

Update: this question is how can we use amortized analysis for immutable collections? Scala's immutable queue is just an example. How this immutable queue is implemented is clearly visible from sources. And as it was pointed in answers Scala's sources does not mention amortized time for it al all. But guides and internet podcasts do. And as I saw C# has the similar immutable queue with similar statements about amortized time for it.

The amortized analysis was invented originally for mutable collections. How we can apply it to a Scala's mutable Queue is clear. But how can we apply it to this sequence of operations for example?

val q0 = collection.immutable.Queue[Int]()
val q1 = q0.enqueue(1)
val h1 = q1.head
val q2 = q1.enqueue(2)
val h2 = q2.head
val (d2, q3) = q2.dequeue()
val (d1, q4) = q3.dequeue()

We have different immutable queues in sequence q0-q4. May we consider them as one single queue or not? How we can use O(1) enqueue operations to amortize both heavy head and the first dequeue? What method of amortized analysis we can use for this sequence? I don't know. I can not find anything in textbooks.

Original question:

According to an original definition the amortized time complexity is a worst case averaged over sequence complexity for allowed sequences of operations: "we average the running time per operation over a (worst-case) sequence of operations" https://courses.cs.duke.edu/fall11/cps234/reading/Tarjan85_AmortizedComplexity.pdf See textbooks also ("Introduction To Algorithms" by Cormen et al. for example)

Scala's Collection Library guide states two collection.immutable.Queue methods (head and tail) have amortized O(1) time complexity: https://docs.scala-lang.org/overviews/collections-2.13/performance-characteristics.html This table does not mention complexities of enqueue and dequeue operations but another unofficial guide states O(1) time complexity for enqueue and amortized O(1) time complexity for dequeue. https://www.waitingforcode.com/scala-collections/collections-complexity-scala-immutable-collections/read

But what that statements for the amortized time complexity really mean? Intuitively they allow us to make predictions for algorithms with the data structure used such as any allowed by the data structure itself sequence of N amortized O(1) operations have no worse than O(N) complexity for the sequence. Unfortunately this intuitive meaning is clearly broken for immutable collections. For example, next function does have time complexity O(n^2) for 2n amortized O(1) (according to the guides) operations:

def quadraticInTime(n: Int) = {
  var q = collection.immutable.Queue[Int]()
  for (i <- 1 to n) q = q.enqueue(i)
  List.fill(n)(q.head)
}

val tryIt = quadraticInTime(100000)

The second parameter of List.fill method is a by name parameter and is evaluated n times in sequence. We can also use q.dequeue._1 instead of q.head of course with the same result.

Also we can read in "Programming in Scala" by M. Odersky et al.: "Assuming that head, tail, and enqueue operations appear with about the same frequency, the amortized complexity is hence constant for each operation. So functional queues are asymptotically just as efficient as mutable ones." This contradicts to the worst case amortized complexity property from textbooks and wrong for quadraticInTime method above.

But if a data structure has O(1) time complexity of cloning we can break amortized time analysis assumptions for it just by executing N worst case "heavy" operations on N the data structure copies in sequence. And generality speaking any immutable collection have O(1) time complexity of cloning.

Question: is there a good formal definition of amortized time complexity for operations on immutable structures? The definition clearly must farther limit allowed sequences of operations to be useful.

CodePudding user response:

The docs for Queue give tighter conditions for which the asymptotic complexity holds:

Adding items to the queue always has cost O(1). Removing items has cost O(1), except in the case where a pivot is required, in which case, a cost of O(n) is incurred, where n is the number of elements in the queue. When this happens, n remove operations with O(1) cost are guaranteed. Removing an item is on average O(1).

Note that removing from an immutable queue implies that when dequeueing, subsequent operations are on the returned Queue. Not doing this also means that it's not actually being used as a queue:

val q = Queue.empty[Int].enqueue(0).enqueue(1)
q.dequeue()._1    // 1
q.dequeue()._1    // 1

In your code, you're not actually using the Queue as a queue. Addressing this:

def f(n: Int): Unit = {
  var q = Queue.empty[Int]
  (1 to n).foreach { i => q = q.enqueue(i) }
  List.fill(n) {
    val (head, nextQ) = q.dequeue
    q = nextQ
    head
  }
}

def time(block: => Unit): Unit = {
  val startNanos = System.nanoTime()
  block
  println(s"Block took ${ System.nanoTime() - startNanos }ns")
}
scala> time(f(10000))
Block took 2483235ns

scala> time(f(20000))
Block took 5039420ns

Note also the if we do an approximately equal number of enqueue and head operations on the same scala.collection.immutable.Queue, the head operations are in fact constant time even without amortization:

val q = Queue.empty[Int]
(1 to n).foreach { i => q.enqueue(i) }
List.fill(n)(Try(q.head))

CodePudding user response:

Chris Okasaki described how to solve this problem with lazy evaluation in Amortization, Lazy Evaluation, and Persistence: Lists with Catenation via Lazy Linking from 1995. The main idea is that you can guarantee that some action is done only once by hiding it in a thunk and letting the language runtime manage evaluating it exactly once.

  • Related