how do we copy files from Hadoop to abfs (azure blob file system) I want to copy from Hadoop fs to abfs file system but it throws an error this is the command I ran
hdfs dfs -ls abfs://....
ls: No FileSystem for scheme "abfs"
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem not found
any idea how this can be done ?
CodePudding user response:
In the core-site.xml
you need to add a config property for fs.abfs.impl
with value org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem
, and then add any other related authentication configurations it may need.
More details on installation/configuration here - https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html
CodePudding user response:
the abfs binding is already in core-default.xml for any release with the abfs client present. however, the hadoop-azure jar and dependency is not in the hadoop common/lib dir where it is needed (it is in HDI, CDH, but not the apache one)
you can tell the hadoop script to pick it and its dependencies up by setting the HADOOP_OPTIONAL_TOOLS
env var; you can do this in ~/.hadoop-env
; just try on your command line first
export HADOOP_OPTIONAL_TOOLS="hadoop-azure,hadoop-aws"
after doing that, download the latest cloudstore jar and use its storediag command to attempt to connect to an abfs URL; it's the place to start debugging classpath and config issues