Nested dictionary (with defaults) surpisingly slow updating/inserting a large number of values-CodePudding

I have some swift that looks like this:

import Foundation

var sequence = [Int]()
for _ in 0..<10_000 {
    let random = Int.random(in: 1...5)
    sequence.append(random)
}

var dict: [Int: [Int: Int]] = [:]
for (a, b) in zip(sequence, sequence[1...]) {
    dict[a, default: [:]][b, default: 0]  = 1
}

Basically, a nested dictionary that stores a counter for each element.

In a Playground it takes about ~10 seconds. And in a proper test suite with self.measure { ... } it's not that much faster.

Re-written in Python:

from collections import defaultdict
import random

sequence = []
for _ in range(10_000):
    rand = random.choice([1, 2, 3, 4, 5])
    sequence.append(rand)

my_dict = defaultdict(lambda: defaultdict(lambda: 0))
for a, b in zip(sequence, sequence[1:]):
    my_dict[a][b]  = 1

... it basically runs instantly.

So two questions:

What's happening here?! I thought swift dictionaries were O(1)?
Most importantly, this nested dictionary faster / more performant?

CodePudding user response：

The performance is because of all the copy-on-write behaviour your triggering with those mutations. See https://developer.apple.com/documentation/swift/dictionary/3127177-reduce for the reduce into method and the performance discussion there

CodePudding user response：

A bit faster with Shadowrun's help:

import Foundation

typealias NestedDict = [Int: [Int: Int]]

var sequence = (0..<10_000).map { _ in Int.random(in: 1...5) }

var dict = NestedDict()
dict = zip(sequence, sequence[1...]).reduce(into: NestedDict()) { partialResult, pair in
    partialResult[pair.0, default: [:]][pair.1, default: 0]  = 1
}