I'm using Elassandra 6.2.3. I have set up a cluster of 3 nodes and created a keyspace with a replication factor of 2. The Cassandra configuration uses Murmur3Partitioner and num_tokens=8.
CREATE KEYSPACE mykeyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2'} AND durable_writes = true;
DESC mykeyspace;
CREATE TABLE mykeyspace.mytable(
f1 text,
f2 timestamp,
f3 text,
f4 text,
f5 int,
PRIMARY KEY (f1, f2)
) WITH CLUSTERING ORDER BY (f2 DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX elastic_mytable_idx ON mykeyspace.mytable () USING 'org.elassandra.index.ExtendedElasticSecondaryIndex';
Here is the ring:
$ nodetool ring mykeyspace
Datacenter: DC1
==========
Address Rack Status State Load Owns Token
8046893569169204194
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% -7885825037800730315
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% -7261086042602187969
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% -7247943966600989463
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% -7228717159480176131
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% -6939207504674930480
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% -6158757762234956967
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% -4699623277895141955
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% -4269715227726417275
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% -3148156422280710025
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% -2567971232125784764
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% -2187229040967677675
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% -2058807466377445130
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% -1181919914747129817
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% 695306942662545127
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% 1989050017548537421
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% 2881433693910708029
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% 3454959670543032324
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% 3833350227892101457
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% 4855735318033934682
10.8.0.6 r1 Up Normal 30.77 GiB 86.15% 6288034337780481749
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% 6495870875989416002
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% 6853344637592889364
10.8.0.14 r1 Up Normal 7.36 GiB 49.61% 6911496393497851249
10.8.0.1 r1 Up Normal 17.55 GiB 64.24% 8046893569169204194
I have created a testing program that generates 1M rows of random data and sends them to Cassandra via the Node.js cassandra-driver library. The testing program generates data with roughly 2900 distinct partition keys (f1) and distinct clustering keys (f2).
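For context, here is a minimal sketch of what such a load generator looks like with cassandra-driver. The contact points, key pattern, and column values are illustrative placeholders, not the actual test program:

```typescript
// Minimal load-generator sketch (assumes the Node.js "cassandra-driver" package
// and the mykeyspace.mytable schema shown above; values are illustrative).
import { Client } from 'cassandra-driver';

const client = new Client({
  contactPoints: ['10.8.0.1', '10.8.0.6', '10.8.0.14'], // assumed node addresses
  localDataCenter: 'DC1',
  keyspace: 'mykeyspace',
});

const INSERT = 'INSERT INTO mytable (f1, f2, f3, f4, f5) VALUES (?, ?, ?, ?, ?)';

async function run(): Promise<void> {
  await client.connect();
  for (let i = 0; i < 1000000; i++) {
    // ~2900 distinct partition keys (f1), varying clustering keys (f2).
    const f1 = `key-${i % 2900}`;
    const f2 = new Date(Date.now() - Math.floor(Math.random() * 1e9));
    await client.execute(INSERT, [f1, f2, 'some text', 'more text', i], { prepare: true });
  }
  await client.shutdown();
}

run().catch((err) => {
  console.error(err);
  process.exit(1);
});
```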
The result is that the data is distributed like this:
$ nodetool status mykeyspace
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.8.0.14 12.47 GiB 8 49.6% c5a17dbd-40ef-4f58-b132-0d977a92f1a1 r1
UN 10.8.0.1 17.55 GiB 8 64.2% f088d009-bd97-4e35-9f20-60006a68b363 r1
UN 10.8.0.6 33.49 GiB 8 86.1% e82191ad-9d9f-459f-9da0-2b0457ad6611 r1
Why does one node have almost double the load of the other two nodes?
Thanks
CodePudding user response:
In Cassandra, vnode tokens are allocated randomly, so the resulting token ranges may differ in size.
In practice, this effect is mitigated by using a high number of tokens.
If you want a better distribution, you could set num_tokens to 256.
But there is a drawback: a high number of tokens increases the cost of search requests in Elassandra, because token ranges are used to filter out duplicate results. See the Elassandra search path documentation for more information.
I would recommend not setting num_tokens higher than 16, and ideally benchmarking different settings to fit your specific needs.
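For reference, num_tokens is a per-node setting in cassandra.yaml, and it only takes effect when a node first bootstraps; a node that already owns tokens keeps them, so in practice changing it means rebuilding or replacing the nodes. A minimal illustration, using the suggested value of 16:

```yaml
# cassandra.yaml (per node) -- only applied when the node first joins the ring;
# existing nodes keep their current tokens and must be re-bootstrapped to change this.
num_tokens: 16
```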