Transfer files from HDFS dir to sftp server

Time:11-24

I am trying to transfer all the part* files in an HDFS directory straight to an sftp server. The files in the HDFS folder are pretty huge, so I do not want to copy them to the local file system first.

The current setup is:

hdfs dfs -text "<HDFS_DIR>/part*" > local_file

curl -u "<sftp_username>:" --key "<private_key_file_path>" --pubkey "<public_key_file_path>" \
    --upload-file local_file "sftp://<SFTP_HOST>/<Upload_dir>/"

How can I upload the files directly from HDFS to the sftp server path without writing them to the local filesystem?

I considered the following options

  1. Sqoop with the SFTP connector (did not find enough resources) - https://sqoop.apache.org/docs/1.99.7/user/connectors/Connector-SFTP.html
  2. Copy each part file to local fs and move it to sftp server (inefficient)
  3. hadoop distcp with sftp doesn't work in cdh5. I am using CDH-5.16.2

Please let me know which is the best way to accomplish this. Thanks!

CodePudding user response:

Maybe you can pipe hdfs's output directly to curl for upload by using --upload-file . or --upload-file - (the URL then has to include the remote file name, since there is no local file name for curl to reuse), e.g.

hdfs dfs -text "<HDFS_DIR>/part*" | curl -u "<sftp_username>:" --key "<private_key_file_path>" --pubkey "<public_key_file_path>" \
    --upload-file . "sftp://<SFTP_HOST>/<Upload_dir>/<file_name>"

About the difference between . and -, the docs say:

Use the file name "-" (a single dash) to use stdin instead of a given file. Alternately, the file name "." (a single period) may be specified instead of "-" to use stdin in non-blocking mode to allow reading server output while stdin is being uploaded.

To me that sounds like curl may try to hold the whole file in RAM, or at least in a stdin buffer, before starting the upload when - is used, so . sounds safer than - if you expect to deal with large files.
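Since the question is about all the part* files, a per-file variant of the same pipe might look like the sketch below. It is only a sketch under assumptions: the function name stream_parts_to_sftp and its arguments are made up for illustration, it assumes `hdfs dfs -ls` prints the full path in the last column and that the part file names contain no whitespace, and it uses -T - (the short form of --upload-file -) with the remote file name appended to the URL.

```shell
# Sketch: upload each HDFS part file as its own remote file, without a local copy.
# stream_parts_to_sftp and its parameters are hypothetical names, not an existing tool.
stream_parts_to_sftp() {
  sp_hdfs_dir=$1   # e.g. <HDFS_DIR>
  sp_user=$2       # e.g. <sftp_username>
  sp_key=$3        # e.g. <private_key_file_path>
  sp_pubkey=$4     # e.g. <public_key_file_path>
  sp_url=$5        # e.g. sftp://<SFTP_HOST>/<Upload_dir>

  # Regular files start with "-" in the listing; the last column is the path.
  hdfs dfs -ls "${sp_hdfs_dir}/part*" | awk '/^-/ {print $NF}' |
  while read -r sp_path; do
    sp_name=${sp_path##*/}   # keep only the file name, e.g. part-00000
    # hdfs gets /dev/null as stdin so it cannot swallow the file list;
    # curl reads the upload body from the pipe and writes it to <url>/<name>.
    hdfs dfs -text "$sp_path" </dev/null |
      curl -u "${sp_user}:" --key "$sp_key" --pubkey "$sp_pubkey" \
           -T - "${sp_url%/}/${sp_name}"
  done
}

# Example call (placeholders as in the question):
# stream_parts_to_sftp "<HDFS_DIR>" "<sftp_username>" "<private_key_file_path>" \
#     "<public_key_file_path>" "sftp://<SFTP_HOST>/<Upload_dir>"
```

This keeps one remote file per part file instead of concatenating everything into a single upload; swap -T - for -T . if the non-blocking mode turns out to behave better for you.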

CodePudding user response:

You could probably do it like this, provided the server allows a normal SSH login (an SFTP-only account cannot run cat):

hdfs dfs -cat "<HDFS_DIR>/part*" | ssh <sftp_username>@<sftp_hostname> 'cat - > <Upload_dir>/<file_name>'
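If you would rather keep one remote file per part file than concatenate them, the same ssh idea can be wrapped in a loop. This is a rough sketch, not a tested recipe: copy_parts_over_ssh and its arguments are hypothetical names, it assumes the account can run cat over SSH, that `hdfs dfs -ls` prints the path in the last column, and that neither the file names nor the remote directory contain whitespace or quotes.

```shell
# Sketch: stream each HDFS part file to its own file on the remote host via ssh.
# copy_parts_over_ssh and its parameters are hypothetical names for illustration.
copy_parts_over_ssh() {
  cp_hdfs_dir=$1     # e.g. <HDFS_DIR>
  cp_target=$2       # e.g. <sftp_username>@<sftp_hostname>
  cp_remote_dir=$3   # e.g. <Upload_dir>

  # Regular files start with "-" in the listing; the last column is the path.
  hdfs dfs -ls "${cp_hdfs_dir}/part*" | awk '/^-/ {print $NF}' |
  while read -r cp_path; do
    cp_name=${cp_path##*/}   # e.g. part-00000
    # ssh normally eats the loop's stdin; here its stdin is the hdfs pipe,
    # and hdfs itself is given /dev/null so the file list stays intact.
    hdfs dfs -cat "$cp_path" </dev/null |
      ssh "$cp_target" "cat > '${cp_remote_dir}/${cp_name}'"
  done
}

# Example call (placeholders as above):
# copy_parts_over_ssh "<HDFS_DIR>" "<sftp_username>@<sftp_hostname>" "<Upload_dir>"
```

The redirection runs in the remote shell, which is why the remote path is wrapped in single quotes inside the command string.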