HDFS NameNode default data write: how do I select a DataNode?

When the NameNode handles a default data write, how do I select a DataNode? The cluster has six DataNodes (a, b, c, d, e, f).

An HDFS client (z) writes data with put (hello.txt). How does the NameNode choose the first DataNode?

I am not looking for information about Rack Awareness or BlockPlacementPolicy.

Is there any detailed documentation?

CodePudding user response：

I read the source code and now understand the relevant principle:

/**
 * Given datanode address or host name, returns the DatanodeDescriptor for the
 * same, or if it doesn't find the datanode, it looks for a machine local and
 * then rack local datanode, if a rack local datanode is not possible either,
 * it returns the DatanodeDescriptor of any random node in the cluster.
 *
 * @param address hostaddress:transfer address
 * @return the best match for the given datanode
 */

// If we can't even choose rack local, just choose any node in the
// cluster.
if (node == null) {
  node = (DatanodeDescriptor) getNetworkTopology()
      .chooseRandom(NodeBase.ROOT);
}
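
The fallback order described in the Javadoc above (machine local, then rack local, then any random node) can be sketched roughly as follows. This is only an illustration: FirstNodeSketch, Node, and choose are hypothetical names, not Hadoop classes or methods, and the real code works on DatanodeDescriptor and NetworkTopology objects rather than plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified model for illustration; none of these types are Hadoop classes.
final class FirstNodeSketch {

    // A datanode described only by its host name and rack path.
    static final class Node {
        final String host;
        final String rack;

        Node(String host, String rack) {
            this.host = host;
            this.rack = rack;
        }
    }

    // Fallback order taken from the Javadoc: same machine, then same rack,
    // then any random node. Assumes the cluster list is not empty.
    static Node choose(String clientHost, String clientRack, List<Node> cluster, Random random) {
        // 1. Machine local: the client runs on a datanode host.
        for (Node n : cluster) {
            if (n.host.equals(clientHost)) {
                return n;
            }
        }
        // 2. Rack local: pick randomly among the nodes on the client's rack.
        List<Node> sameRack = new ArrayList<>();
        for (Node n : cluster) {
            if (n.rack.equals(clientRack)) {
                sameRack.add(n);
            }
        }
        if (!sameRack.isEmpty()) {
            return sameRack.get(random.nextInt(sameRack.size()));
        }
        // 3. Neither is possible: pick any node in the cluster.
        return cluster.get(random.nextInt(cluster.size()));
    }

    public static void main(String[] args) {
        List<Node> cluster = Arrays.asList(
            new Node("a", "/rack1"), new Node("b", "/rack1"),
            new Node("c", "/rack2"), new Node("d", "/rack2"));
        Random random = new Random();
        System.out.println(choose("a", "/rack1", cluster, random).host);   // machine local
        System.out.println(choose("z", "/rack2", cluster, random).host);   // rack local
        System.out.println(choose("z", "/unknown", cluster, random).host); // random node
    }
}
```

In the question's scenario the client z is not one of the DataNodes, so in this sketch step 1 never matches and the result depends on whether z's rack can be resolved.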

CodePudding user response：

You can't. DataNode selection is transparent to the HDFS client, and block placement is exactly what you're asking about. Files are split into blocks, and only blocks are placed on DataNodes, not files. The information that makes up "a file" is stored only as metadata in the NameNode (a file path mapped to a list of blocks on the DataNodes).

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
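
To make the "file path mapped to a list of blocks" idea concrete, here is a minimal sketch of that kind of metadata. NamespaceSketch, Block, addBlock, and blocksOf are hypothetical names chosen for illustration; they do not reflect the NameNode's actual internal data structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of namenode-style metadata; not Hadoop's real data structures.
final class NamespaceSketch {

    // One block of a file and the datanodes holding its replicas.
    static final class Block {
        final long id;
        final List<String> datanodes;

        Block(long id, List<String> datanodes) {
            this.id = id;
            this.datanodes = datanodes;
        }
    }

    // File path -> ordered list of blocks. The "file" exists only as this entry.
    private final Map<String, List<Block>> files = new HashMap<>();
    private long nextBlockId = 1;

    // Records a new block for the path, placed on the given datanodes.
    Block addBlock(String path, List<String> datanodes) {
        Block block = new Block(nextBlockId++, new ArrayList<>(datanodes));
        files.computeIfAbsent(path, k -> new ArrayList<>()).add(block);
        return block;
    }

    // Returns the blocks recorded for the path, or an empty list if unknown.
    List<Block> blocksOf(String path) {
        return files.getOrDefault(path, new ArrayList<>());
    }

    public static void main(String[] args) {
        NamespaceSketch ns = new NamespaceSketch();
        ns.addBlock("/hello.txt", Arrays.asList("a", "c", "e"));
        ns.addBlock("/hello.txt", Arrays.asList("b", "d", "f"));
        for (Block b : ns.blocksOf("/hello.txt")) {
            System.out.println("block " + b.id + " on " + b.datanodes);
        }
    }
}
```

The client only ever names the path; which DataNodes end up in each block's list is decided on the NameNode side.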