hadoop get files from existing archived file in hdfs

Time:01-13

I have a directory "SmallFiles" that contains 8 files. I archived them using "hadoop archive -archiveName myArch.har -p /Files/SmallFiles /Files" and then deleted the original files. How can I extract the files again?

When I download the archive, I get these 3 files: "index", "masterindex", and "part-0".

CodePudding user response:

You need to access archived files via the har:// URI.

For example, files archived with: hadoop archive -archiveName foo.har -p /user/hadoop dir1 dir2 /user/zoo can be listed with: hadoop dfs -lsr har:///user/zoo/foo.har/ (on current releases the equivalent is hdfs dfs -ls -R har:///user/zoo/foo.har/).

I think the docs are straightforward here: https://hadoop.apache.org/docs/current/hadoop-archives/HadoopArchives.html
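
Applied to the archive in the question, a minimal sketch could look like the following. It assumes the archive ended up at /Files/myArch.har (the destination given to hadoop archive) and that the files should go back to /Files/SmallFiles. The function name restore_from_har is hypothetical, and the quoted glob relies on Hadoop expanding the pattern inside the har:// filesystem, which is worth checking with the listing step first. The index, masterindex, and part-0 files you downloaded are the archive's internal storage, so the files have to be read through har:// rather than from those three files directly.

```shell
# Sketch only: copy files out of a Hadoop archive back into HDFS.
# restore_from_har is a hypothetical helper name, not a Hadoop command.
restore_from_har() {
  local archive_uri="$1"   # e.g. har:///Files/myArch.har (assumed location)
  local dest_dir="$2"      # e.g. /Files/SmallFiles (assumed destination)

  # List the archive contents through the har:// filesystem (read-only check).
  hdfs dfs -ls -R "$archive_uri" || return 1

  # Make sure the destination directory exists.
  hdfs dfs -mkdir -p "$dest_dir" || return 1

  # Copy the archived files out. The glob is quoted so that Hadoop,
  # not the local shell, expands it.
  hdfs dfs -cp "${archive_uri}/*" "$dest_dir/"
}
```

You would call it as: restore_from_har har:///Files/myArch.har /Files/SmallFiles. The archive itself is left untouched. For large archives, the linked docs also describe unarchiving in parallel with hadoop distcp, using a har:// URI as the source.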
