Home > Net >  Faster HashMap for sequential keys
Faster HashMap for sequential keys

Time:01-02

Initially I was very surprised to find out Rust's HashMap, even with the FNV hasher, was considerably slower than the equivalents in Java, .NET, PHP. I am talking about optimized Release mode, not Debug mode. I did some calculations and realized the timings in Java/.NET/PHP were suspiciously low. Then it hit me - even though I was testing with a big hash table (millions of entries), I was reading mostly sequential key values (like 14, 15, 16, ...), which apparently resulted in lots of CPU cache hits, due to the way the standard hash tables (and hash-code functions for integers and short strings) in those languages are implementated, so that entries with nearby keys are usually located in nearby memory locations.

Rust's HashMap, on the other hand, uses the so called SwissTable implementation, which apparently distributes values differently. When I tested reading by random keys, everything fell into place - the "competitors" scored behind Rust.

So if we are in a situation where we need to perform lots of gets sequentially, for example iterating some DB IDs that are ordered and mostly sequential (with not too many gaps), is there a good Rust hash map implementation that can compete with Java's HashMap or .NET's Dictionary?


P.S. As requested in the comments, I paste an example here. I ran lots of tests, but here is a simple example that takes 75 ms in Rust (release mode) and 20 ms in Java:

In Rust:

let hm: FnvHashMap<i32, i32> = ...;  
// Start timer here
let mut sum: i64 = 0;
for i in 0..1_000_000 {
    if let Some(x) = hm.get(&i) {
        sum  = *x as i64;
    }
}
println!("The sum is: {}", sum);

In Java:

Map<Integer, Integer> hm = ...;
// Start timer here
long sum = 0;
for (int i = 0; i < 1_000_000; i  ) {
    sum  = hm.get(i);
}

With HashMap<i32, i32> and its default SipHash hasher it took 190 ms. I know why it's slower than FnvHashMap. I'm just mentioning that for completeness.

CodePudding user response:

First, here is some runnable code to measure the efficiency of the different implementations:

use std::{collections::HashMap, time::Instant};

fn main() {
    let hm: HashMap<i32, i32> = (0..1_000_000).map(|i| (i, i)).collect();
    let t0 = Instant::now();
    let mut sum = 0;
    for i in 0..1_000_000 {
        if let Some(x) = hm.get(&i) {
            sum  = x;
        }
    }
    let elapsed = t0.elapsed().as_secs_f64();
    println!("{} - The sum is: {}", elapsed, sum);
}

On the old desktop machine I'm writing this on, it reports 76 ms to run. Since the machine is 10 years old, I find it baffling that your hardware would take 190 ms to run the same code, so I'm wondering how and what you're actually measuring. But let's ignore that and concentrate on the relative numbers.

When you care about hashmap efficiency in Rust, and when the keys don't come from an untrusted source, the first thing to try should always be to switch to a non-DOS-resistant hash function. One possibility is the FNV hash function from the fnv crate, which you can get by switching HashMap to fnv::FnvHashMap. That brings performance to 34 ms, i.e. a 2.2x speedup.

If this is not enough, you can try the hash from the rustc-hash crate (almost the same as fxhash, but allegedly better maintained), which uses the same function as the Rust compiler, adapted from the hash used by Firefox. While not based on any widely-known algorithm, it reportedly consistently outperforms FNV. That's confirmed on the above example, where switching from FnvHashMap to rustc_hash::FxHashMap drops the time to 28 ms, i.e. a 2.7x speedup from the initial timing.

Finally, if you want to just imitate what C# and Java do, and could care less about certain patterns of inserted numbers leading to degraded performance, you can use the aptly-named nohash_hasher crate that gives you an identity hash. Changing HashMap<i32, i32> to HashMap<i32, i32, nohash_hasher::BuildNoHashHasher<i32>> drops the time to just under 4 ms, i.e. a whopping 19x speedup from the initial timing.

Since you report the Java example to be 9.5x faster than Rust, a 19x speedup should make your code approximately twice as fast as Java.

CodePudding user response:

Rust's HashMap by default uses an implementation of SipHash as the hash function. SipHash is designed to avoid denial-of-service attacks based on predicting hash collisions, which is an important security property for a hash function used in a hash map.

If you don't need this guarantee, you can use a simpler hash function. One option is using the fxhash crate, which should speed up reading integers from a HashMap<i32, i32> by about a factor of 3.

Other options are implementing your own trivial hash function (e.g. by simply using the identity function, which is a decent hash function for mostly consecutive keys), or using a vector instead of a hash map.

.NET uses the identity function for hashes of Int32 by default, so it's not resistant to hash flooding attacks. Of course this is faster, but the downside is not even mentioned in the documentation of Dictionary. For what it's worth, I prefer Rust's "safe by default" approach over .NET's any day, since many developers aren't even aware of the problems predictable hash functions can cause. Rust still allows you to use a more performant hash function if you don't need the hash flooding protection, so to me personally this seems to be a strength of Rust compared to at least .NET rather than a weakness.

CodePudding user response:

I decided to run some more tests, based on the suggestions by user4815162342. This time I used another machine with Ubuntu 20.04.

Rust code

println!("----- HashMap (with its default SipHash hasher) -----------");
let hm: HashMap<i32, i32> = (0..1_000_000).map(|i| (i, i)).collect();
for k in 0..6 {
    let t0 = Instant::now();
    let mut sum: i64 = 0;
    for i in 0..1_000_000 {
        if let Some(x) = hm.get(&i) {
            sum  = *x as i64;
        }
    }
    let elapsed = t0.elapsed().as_secs_f64();
    println!("The sum is: {}. Time elapsed: {:.3} sec", sum, elapsed);
}

println!("----- FnvHashMap (fnv 1.0.7) ------------------------------");
let hm: FnvHashMap<i32, i32> = (0..1_000_000).map(|i| (i, i)).collect();
for k in 0..6 {
    let t0 = Instant::now();
    let mut sum: i64 = 0;
    for i in 0..1_000_000 {
        if let Some(x) = hm.get(&i) {
            sum  = *x as i64;
        }
    }
    let elapsed = t0.elapsed().as_secs_f64();
    println!("The sum is: {}. Time elapsed: {:.3} sec", sum, elapsed);
}

println!("----- FxHashMap (rustc-hash 1.1.0) ------------------------");
let hm: FxHashMap<i32, i32> = (0..1_000_000).map(|i| (i, i)).collect();
for k in 0..6 {
    let t0 = Instant::now();
    let mut sum: i64 = 0;
    for i in 0..1_000_000 {
        if let Some(x) = hm.get(&i) {
            sum  = *x as i64;
        }
    }
    let elapsed = t0.elapsed().as_secs_f64();
    println!("The sum is: {}. Time elapsed: {:.3} sec", sum, elapsed);
}

println!("----- HashMap/BuildNoHashHasher (nohash-hasher 0.2.0) -----");
let hm: HashMap<i32, i32, nohash_hasher::BuildNoHashHasher<i32>> = (0..1_000_000).map(|i| (i, i)).collect();
for k in 0..6 {
    let t0 = Instant::now();
    let mut sum: i64 = 0;
    for i in 0..1_000_000 {
        if let Some(x) = hm.get(&i) {
            sum  = *x as i64;
        }
    }
    let elapsed = t0.elapsed().as_secs_f64();
    println!("The sum is: {}. Time elapsed: {:.3} sec", sum, elapsed);
}

BTW the last one can be replaced with this shorter type:

let hm: IntMap<i32, i32> = (0..1_000_000).map(|i| (i, i)).collect();

For those interested, this is IntMap's definition:

pub type IntMap<K, V> = std::collections::HashMap<K, V, BuildNoHashHasher<K>>;

Java code

On the same machine I tested a Java example. I don't have a JVM installed on it, so I used a Docker image adoptopenjdk/openjdk14 and directly pasted the code below in jshell> (not sure if that would hurt Java's timings). So this is the Java code:

Map<Integer, Integer> hm = new HashMap<>();
for (int i = 0; i < 1_000_000; i  ) {
    hm.put(i, i);
}
for (int k = 0; k < 6; k  ) {
    long t0 = System.currentTimeMillis();
    // Start timer here
    long sum = 0;
    for (int i = 0; i < 1_000_000; i  ) {
        sum  = hm.get(i);
    }
    System.out.println("The sum is: "   sum   ". Time elapsed: "   (System.currentTimeMillis() - t0)   " ms");
}

Results

Rust (release mode):

----- HashMap (with its default SipHash hasher) -----------
The sum is: 499999500000. Time elapsed: 0.149 sec
The sum is: 499999500000. Time elapsed: 0.140 sec
The sum is: 499999500000. Time elapsed: 0.167 sec
The sum is: 499999500000. Time elapsed: 0.150 sec
The sum is: 499999500000. Time elapsed: 0.261 sec
The sum is: 499999500000. Time elapsed: 0.189 sec
----- FnvHashMap (fnv 1.0.7) ------------------------------
The sum is: 499999500000. Time elapsed: 0.055 sec
The sum is: 499999500000. Time elapsed: 0.052 sec
The sum is: 499999500000. Time elapsed: 0.053 sec
The sum is: 499999500000. Time elapsed: 0.058 sec
The sum is: 499999500000. Time elapsed: 0.051 sec
The sum is: 499999500000. Time elapsed: 0.056 sec
----- FxHashMap (rustc-hash 1.1.0) ------------------------
The sum is: 499999500000. Time elapsed: 0.039 sec
The sum is: 499999500000. Time elapsed: 0.076 sec
The sum is: 499999500000. Time elapsed: 0.064 sec
The sum is: 499999500000. Time elapsed: 0.048 sec
The sum is: 499999500000. Time elapsed: 0.057 sec
The sum is: 499999500000. Time elapsed: 0.061 sec
----- HashMap/BuildNoHashHasher (nohash-hasher 0.2.0) -----
The sum is: 499999500000. Time elapsed: 0.004 sec
The sum is: 499999500000. Time elapsed: 0.003 sec
The sum is: 499999500000. Time elapsed: 0.003 sec
The sum is: 499999500000. Time elapsed: 0.003 sec
The sum is: 499999500000. Time elapsed: 0.003 sec
The sum is: 499999500000. Time elapsed: 0.003 sec

Java:

The sum is: 499999500000. Time elapsed: 49 ms  // see notes below
The sum is: 499999500000. Time elapsed: 41 ms  // see notes below
The sum is: 499999500000. Time elapsed: 18 ms
The sum is: 499999500000. Time elapsed: 29 ms
The sum is: 499999500000. Time elapsed: 19 ms
The sum is: 499999500000. Time elapsed: 23 ms

(With Java the first 1-2 runs are normally slower, as the JVM HotSpot still hasn't fully optimized the relevant piece of code.)

  • Related