I have a Java application that starts up a Spark worker:
Worker.startRpcEnvAndEndpoint(args.host(), args.port(), args.webUiPort(), args.cores(), args.memory(), args.masters(), args.workDir(), scala.Option.empty(), conf);
(see https://books.japila.pl/spark-standalone-internals/Worker/#externalshuffleservice)
I would now like to set up an IP access filter so that only a hard-coded list of IP addresses can access this service.
Is there a way to configure the Java program above to provide such an IP access filter?
CodePudding user response:
I am not aware of a Spark-internal mechanism for this, but from a server bind-address perspective the best you can do is isolate the bind address to a specific interface/subnet. That starts with your args.host().
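As a minimal sketch of that idea, the only change from your existing call would be the host argument: pass the address of the specific internal interface rather than a catch-all like 0.0.0.0. The 10.0.0.5 address below is a placeholder for whatever interface IP you actually want to expose.

import org.apache.spark.deploy.worker.Worker;

// Identical to your original call except for the first argument:
// bind only to the internal interface instead of all interfaces.
String bindHost = "10.0.0.5"; // placeholder for your internal interface IP

Worker.startRpcEnvAndEndpoint(
        bindHost,
        args.port(),
        args.webUiPort(),
        args.cores(),
        args.memory(),
        args.masters(),
        args.workDir(),
        scala.Option.empty(),
        conf);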
If you want to restrict access to specific IPs within that subnet, you'll need to work with the OS firewall (possibly managing it from code as well, as sketched below), not with the Spark libraries.
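Purely as a sketch of the "manage it from code" idea, assuming a Linux host with iptables and a process running with enough privileges to change firewall rules (the class name, port, and IP addresses here are made up):

import java.io.IOException;
import java.util.List;

public class WorkerFirewall {

    // Hard-coded allow list; replace with your own addresses.
    private static final List<String> ALLOWED_IPS =
            List.of("10.0.0.11", "10.0.0.12");

    private static final int WORKER_PORT = 7078; // must match args.port()

    public static void applyRules() throws IOException, InterruptedException {
        // Accept the listed source addresses on the worker port ...
        for (String ip : ALLOWED_IPS) {
            run("iptables", "-A", "INPUT", "-p", "tcp",
                "--dport", String.valueOf(WORKER_PORT),
                "-s", ip, "-j", "ACCEPT");
        }
        // ... and drop everything else arriving on that port.
        run("iptables", "-A", "INPUT", "-p", "tcp",
            "--dport", String.valueOf(WORKER_PORT), "-j", "DROP");
    }

    private static void run(String... cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        if (p.waitFor() != 0) {
            throw new IllegalStateException("Command failed: " + String.join(" ", cmd));
        }
    }
}

You would call WorkerFirewall.applyRules() before starting the worker. In practice it is usually cleaner to keep firewall rules in your OS/infrastructure configuration rather than in the application itself.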
Going further, to restrict access to certain clients rather than machines, you could issue certificates to specific machines or users, or otherwise tie client identity to an authentication protocol, and then enforce ACL policies in Spark. Kerberos may be an option here as well.
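On the Spark side, the property names below are real Spark settings (shared-secret RPC authentication and user ACLs); the secret and user name are placeholders, and this is only a sketch of how you might build the conf passed to Worker.startRpcEnvAndEndpoint in your program:

import org.apache.spark.SparkConf;

public final class SecureWorkerConf {

    // Builds the SparkConf handed to Worker.startRpcEnvAndEndpoint.
    public static SparkConf build() {
        return new SparkConf()
                // Shared-secret authentication for Spark's RPC layer:
                // every process contacting the worker must present the secret.
                .set("spark.authenticate", "true")
                .set("spark.authenticate.secret", "change-me")
                // User-level ACLs for viewing/modifying applications.
                .set("spark.acls.enable", "true")
                .set("spark.admin.acls", "sparkadmin");
    }
}

Note that this authenticates clients by a shared secret or user identity, not by IP address, so it complements rather than replaces the bind-address and firewall approaches above.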