How to read data from HDFS with Flink in Python

Time:10-25

I want to read data from HDFS with Flink in Python. I found that it is possible with Java or Scala: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/dataset/formats/hadoop/

Indeed, the Flink HDFS connector provides a sink that writes partitioned files to any filesystem supported by the Hadoop FileSystem abstraction.

I know I need to use an InputFormat to specify the source, but I cannot find a good guide for doing this in Python. As far as I can tell, PyFlink has no support for it.

Any help would be appreciated!

CodePudding user response:

I solved this myself. I just needed to configure the Hadoop classpath and create a Flink SQL table that uses the filesystem connector, with these options in the CREATE TABLE statement: WITH ( 'connector' = 'filesystem', 'path' = 'hdfs://namenode:9000/directory/', 'format' = 'json' )
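
For anyone who wants the whole thing in one place, here is a minimal sketch of how that could look in PyFlink. It is not a tested recipe for every setup. The table name hdfs_source, the columns (id, name) and the helper names build_hdfs_source_ddl and read_from_hdfs are assumptions made up for illustration; replace them with your own schema. The path is the one from the answer above. It also assumes that HADOOP_CLASSPATH was exported (for example with `hadoop classpath`) before starting Python.

```python
# Sketch only: table name, column list and helper names are hypothetical.
# Assumes HADOOP_CLASSPATH is set in the environment, e.g.
#   export HADOOP_CLASSPATH=$(hadoop classpath)


def build_hdfs_source_ddl(table_name, columns, path, fmt="json"):
    """Build a CREATE TABLE statement for the Flink filesystem connector.

    columns is a list of (column_name, flink_sql_type) pairs.
    """
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table_name} (\n"
        f"  {cols}\n"
        f") WITH (\n"
        f"  'connector' = 'filesystem',\n"
        f"  'path' = '{path}',\n"
        f"  'format' = '{fmt}'\n"
        f")"
    )


# Hypothetical schema; adjust to match the JSON records stored in HDFS.
DDL = build_hdfs_source_ddl(
    "hdfs_source",
    [("id", "BIGINT"), ("name", "STRING")],
    "hdfs://namenode:9000/directory/",
)


def read_from_hdfs():
    """Register the table and print its rows (needs PyFlink and HDFS access)."""
    # Imported here so the DDL helper can be used without PyFlink installed.
    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())
    t_env.execute_sql(DDL)
    t_env.execute_sql("SELECT * FROM hdfs_source").print()
```

Calling read_from_hdfs() on a machine that has PyFlink installed and can reach the namenode should print the rows; print(DDL) shows the generated statement without touching the cluster.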
