what is the meaning of dfs.replication.max

Time:03-06

Regarding HDFS:

what is the meaning of dfs.replication.max ?

From the docs: https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

It says only "Maximal block replication",

but I still do not understand what this means.

CodePudding user response:

Let's think through this. The default replication factor (dfs.replication) is typically 3; the minimum is a separate setting, dfs.namenode.replication.min, which defaults to 1.

Why have a max? The replication factor can be set per file, for example with hdfs dfs -setrep, and dfs.replication.max is the upper bound on the value a client may request (the default in hdfs-default.xml is 512). The NameNode rejects a request above it. Without a cap, this could get out of hand: 50 replicas of a large file is a lot of duplication and starts to eat into HDFS space. Note that the max is not the point at which extra replicas get culled. If you do a lot of maintenance and regularly take nodes out of the cluster and put them back, a block can temporarily end up with an extra copy (say 4 instead of 3) when a node returns; the NameNode treats that as over-replication and removes the excess down to the file's own replication factor.
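To make the bound concrete, here is a minimal sketch of the kind of check applied when a client asks for a replication factor. It is a simplified model in Python, not actual Hadoop code: the function name, the error messages, and the use of ValueError are all made up for illustration, and the default values are assumed from hdfs-default.xml (dfs.namenode.replication.min = 1, dfs.replication.max = 512).

```python
# Simplified model of the replication bounds check; NOT actual Hadoop code.
# Assumed defaults from hdfs-default.xml:
#   dfs.namenode.replication.min = 1
#   dfs.replication.max          = 512


def verify_replication(requested, min_replication=1, max_replication=512):
    """Return the requested factor if it is within bounds, else raise."""
    if requested > max_replication:
        # Hypothetical message; the real NameNode wording may differ.
        raise ValueError(
            f"requested replication {requested} exceeds maximum {max_replication}"
        )
    if requested < min_replication:
        raise ValueError(
            f"requested replication {requested} is below minimum {min_replication}"
        )
    return requested


if __name__ == "__main__":
    print(verify_replication(3))  # within bounds, accepted
    try:
        verify_replication(600)  # above the cap, rejected
    except ValueError as err:
        print("rejected:", err)
```

Under this model, lowering dfs.replication.max to, say, 5 would mean nobody can create a file with 50 replicas, while files that ask for 3 are unaffected.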
