Apache Flink Operator - enable azure-fs-hadoop

Time:07-04

I am trying to run a Flink job on Kubernetes with the Flink Kubernetes Operator (https://github.com/apache/flink-kubernetes-operator). The job uses a connection to Azure Blob Storage, as described here: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/filesystems/azure/

Following that guide, I need to copy the jar file flink-azure-fs-hadoop-1.15.0.jar from one directory to another inside the Flink distribution.

I have already tried to do this via the podTemplate and the command functionality, but unfortunately it does not work: the file never appears in the destination directory.
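For reference, the copy could in principle be done with an initContainer that reuses the Flink image and shares an emptyDir with the main container (a sketch assuming the stock flink:1.15 image, which ships the plugin jar under /opt/flink/opt/; untested with the operator):

```yaml
spec:
  podTemplate:
    spec:
      initContainers:
        - name: copy-azure-fs
          image: flink:1.15
          # Copy the bundled plugin jar into the shared volume.
          command: ["sh", "-c",
            "cp /opt/flink/opt/flink-azure-fs-hadoop-1.15.0.jar /plugins-out/"]
          volumeMounts:
            - name: azure-plugin
              mountPath: /plugins-out
      containers:
        - name: flink-main-container
          volumeMounts:
            # Mount the copied jar where Flink discovers plugins.
            - name: azure-plugin
              mountPath: /opt/flink/plugins/azure-fs-hadoop
      volumes:
        - name: azure-plugin
          emptyDir: {}
```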

Can you guide me on how to do it properly? Below you can find my FlinkDeployment file.

apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  namespace: flink
  name: basic-example
spec:
  image: flink:1.15
  flinkVersion: v1_15
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  podTemplate:
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-template
    spec:
      serviceAccount: flink
      containers:
        - name: flink-main-container
          volumeMounts:
            - mountPath: /opt/flink/data
              name: flink-data
#          command:
#            - "touch"
#            - "/tmp/test.txt"
      volumes:
        - name: flink-data
          emptyDir: { }

  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
    podTemplate:
      apiVersion: v1
      kind: Pod
      metadata:
        name: job-manager-pod-template
      spec:
        initContainers:
          - name: fetch-jar
            image: cirrusci/wget
            volumeMounts:
              - mountPath: /opt/flink/data
                name: flink-data
            command:
            - "wget"
            - "LINK_TO_CUSTOM_JAR_FILE_ON_AZURE_BLOB_STORAGE"
            - "-O"
            - "/opt/flink/data/test.jar"
        containers:
          - name: flink-main-container
            command:
              - "touch"
              - "/tmp/test.txt"
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/data/test.jar
    parallelism: 2
    upgradeMode: stateless
    state: running
  ingress:
    template: "CUSTOM_LINK_TO_AZURE"
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt
      kubernetes.io/ingress.allow-http: 'false'
      traefik.ingress.kubernetes.io/router.entrypoints: websecure
      traefik.ingress.kubernetes.io/router.tls: 'true'
      traefik.ingress.kubernetes.io/router.tls.options: default

CodePudding user response:

Since you are using the stock Flink 1.15 image, the Azure filesystem plugin comes built in. You can enable it by setting the ENABLE_BUILT_IN_PLUGINS environment variable; the image's entrypoint then places the named jar into the plugins directory on startup.

spec:
  podTemplate:
    spec:
      containers:
        # Do not change the main container name
        - name: flink-main-container
          env:
            - name: ENABLE_BUILT_IN_PLUGINS
              value: flink-azure-fs-hadoop-1.15.0.jar

https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#using-filesystem-plugins
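Once the plugin is enabled you still need credentials for the storage account. Per the Azure filesystem docs linked in the question, the access key can go into flinkConfiguration (a sketch; the account name and key below are placeholders, and for production you would prefer injecting the key via a secret rather than plain text):

```yaml
spec:
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    # Replace <account> and <your-access-key> with your storage account values.
    fs.azure.account.key.<account>.blob.core.windows.net: "<your-access-key>"
```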
