Accessing the HDFS file system via the Java API vs. running HDFS commands through Java Runtime

Time:11-01

What are the advantages and disadvantages of accessing the HDFS file system through the Java API compared with invoking HDFS commands through Java Runtime?

Our HDFS cluster uses Kerberos authentication. In my previous organization we used the HDFS Java API to access the file system, but in my current organization I have been asked to invoke HDFS commands through Java Runtime calls instead. Is it fine to invoke HDFS commands this way?

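// Launch the HDFS CLI as a child process (exec throws IOException).
Runtime r = Runtime.getRuntime();
Process p = r.exec(new String[] {"hdfs", "dfs", "-copyFromLocal", "/tmp/localFile", "/tmp/hdfsDir/"});
// waitFor() blocks until the command exits and returns its exit code (0 means success).
int exitCode = p.waitFor();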

Answer:

This question boils down to "API vs. CLI" and isn't really Hadoop-specific.

At the end of the day, API calls and CLI calls hit the same underlying code and do the same thing. The advantage of using the API is that inputs and results are already Java objects, so there is nothing to format or parse.

If you run hdfs commands from Java as external processes, you have to check the exit code and parse the command's output as a string to figure out whether it did what you expected. Compare that with the HDFS API, where errors surface as exceptions that you can catch and handle.
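As a rough illustration of the API route, here is a minimal sketch of the same copy done through Hadoop's FileSystem class. It assumes the hadoop-client libraries and the cluster's core-site.xml and hdfs-site.xml are on the classpath; the class name, the principal and the keytab path are hypothetical placeholders, not values from the question.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical class name; requires hadoop-client on the classpath.
public class HdfsApiCopy {
    public static void main(String[] args) {
        // Loads core-site.xml / hdfs-site.xml if they are on the classpath.
        Configuration conf = new Configuration();
        try {
            UserGroupInformation.setConfiguration(conf);
            // Placeholder principal and keytab path (assumptions for this sketch).
            UserGroupInformation.loginUserFromKeytab(
                    "etl@EXAMPLE.COM", "/etc/security/keytabs/etl.keytab");

            // Closing the FileSystem is fine in a one-shot program like this one.
            try (FileSystem fs = FileSystem.get(conf)) {
                fs.copyFromLocalFile(new Path("/tmp/localFile"), new Path("/tmp/hdfsDir/"));

                // The result is a typed object, not text that needs parsing.
                FileStatus status = fs.getFileStatus(new Path("/tmp/hdfsDir/localFile"));
                System.out.println("Copied " + status.getLen() + " bytes");
            }
        } catch (FileNotFoundException e) {
            System.err.println("Path not found: " + e.getMessage());
        } catch (IOException e) {
            System.err.println("HDFS operation failed: " + e.getMessage());
        }
    }
}
```

Because the Kerberos login happens inside the same JVM, no separate ticket has to be obtained for a child process before each call.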
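For comparison, this is roughly what the CLI route involves once exit codes and output are handled. It is a minimal sketch using only the JDK; the class and method names (HdfsCli, run, Result) are made up for illustration, and it assumes the hdfs executable is on the PATH of the Java process.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper for running a CLI command and collecting its result.
class HdfsCli {

    /** Exit code plus the combined stdout/stderr text of a finished command. */
    public static final class Result {
        public final int exitCode;
        public final String output;

        public Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs the command and waits for it, merging stderr into stdout. */
    public static Result run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        // Drain the output before waitFor(); a full, unread pipe can stall the child.
        String output = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        return new Result(p.waitFor(), output);
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            Result r = run(List.of("hdfs", "dfs", "-copyFromLocal", "/tmp/localFile", "/tmp/hdfsDir/"));
            if (r.exitCode != 0) {
                // The only failure detail available is free-form text.
                System.err.println("hdfs exited with " + r.exitCode + ": " + r.output);
            }
        } catch (IOException e) {
            // Thrown when the process cannot be started, e.g. hdfs is not on the PATH.
            System.err.println("Could not start hdfs: " + e.getMessage());
        }
    }
}
```

Each call also starts a separate JVM for the hdfs command, so the child process needs its own valid Kerberos credentials in its environment.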
