I have some corrupt files on HDFS because all of their block replicas are reported as missing. Numerous datanodes are down right now, so I want to know which ones to bring back up first in order to recover the missing blocks.
I have the list of affected files and blocks. Is there a way to show the "last known location" of each block, i.e. which datanode it was on?
So far I've tried hadoop fsck, but it seems it can only report that all replicas are missing, not where they were. I also tried hadoop oiv with an XML dump, but I can't see any block location information in it.
Is this information even available anywhere?
CodePudding user response:
The locations of blocks are not persisted. If you restart the Namenodes, they forget all the locations and only learn them from the datanode block reports.
So if blocks are missing, it means that no datanodes found them on their disks, and hence they were not reported to the Namenode.
One thing to check is that every DN is reporting roughly the expected number of blocks, that no failed volumes are reported, and that each DN has the expected number of disks.
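As a rough sketch of that check, the Python below flags datanodes that look unhealthy based on the NameNode's JMX output. It assumes the NameNodeInfo bean (usually served at http://<namenode>:<http-port>/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo) exposes LiveNodes and DeadNodes as JSON strings with per-node numBlocks and volfails fields. That matches the releases I have seen, but verify the field names against your version. The function names and the 0.5 threshold are my own choices, not anything from Hadoop.

```python
import json
import urllib.request
from statistics import median


def fetch_namenode_info(jmx_url):
    """Fetch the first bean returned by a NameNode JMX query URL."""
    with urllib.request.urlopen(jmx_url) as resp:
        return json.load(resp)["beans"][0]


def suspicious_datanodes(namenode_info, min_ratio=0.5):
    """Flag dead datanodes, failed volumes, and unusually low block counts.

    namenode_info is the NameNodeInfo bean as a dict. LiveNodes and
    DeadNodes are assumed to be JSON-encoded strings keyed by host.
    """
    live = json.loads(namenode_info.get("LiveNodes", "{}"))
    dead = json.loads(namenode_info.get("DeadNodes", "{}"))

    # Block count per live datanode (assumed field name: numBlocks).
    counts = {host: int(info.get("numBlocks", 0)) for host, info in live.items()}
    typical = median(counts.values()) if counts else 0

    report = {"dead": sorted(dead), "failed_volumes": [], "low_blocks": []}
    for host, info in sorted(live.items()):
        # Assumed field name: volfails (number of failed volumes).
        if int(info.get("volfails", 0)) > 0:
            report["failed_volumes"].append(host)
        # A node far below the median may be missing a disk.
        if counts[host] < min_ratio * typical:
            report["low_blocks"].append(host)
    return report
```

Comparing against the median only makes sense if the cluster is reasonably balanced. On an uneven cluster, compare each node against its own earlier block count instead.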
If the blocks are recent, you may have some luck grepping the namenode logs for the missing block IDs. There you may find where they were originally allocated, though they may have moved since then, for example if the balancer was run.
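A minimal sketch of that log search in Python, if plain grep gets unwieldy for a long list of blocks. It makes no assumption about the exact namenode log message format, which varies between versions. It only assumes that block IDs appear as blk_<number> and that datanodes appear as IPv4:port somewhere on the same line. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Block IDs look like blk_1073741825, optionally followed by _<genstamp>.
BLOCK_RE = re.compile(r"blk_-?\d+")
# Datanodes are assumed to show up as IPv4:port tokens on the log line.
DATANODE_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}:\d+\b")


def datanodes_mentioned_with_blocks(log_lines, block_ids):
    """Map each wanted block ID to the datanode addresses logged beside it."""
    # Normalise input such as blk_123_1001 to blk_123 (drop the genstamp).
    wanted = set()
    for block_id in block_ids:
        match = BLOCK_RE.search(block_id)
        if match:
            wanted.add(match.group(0))

    found = defaultdict(set)
    for line in log_lines:
        blocks = wanted.intersection(BLOCK_RE.findall(line))
        if not blocks:
            continue
        nodes = set(DATANODE_RE.findall(line))
        for block in blocks:
            found[block].update(nodes)
    return {block: sorted(nodes) for block, nodes in found.items()}
```

To use it, pass an open namenode log file as log_lines together with the block IDs reported by fsck. Treat the result as a list of candidates only: a line that mentions several blocks attributes every address on it to each of them, and blocks absent from the retained logs will not appear at all.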